Introduction


by 
MAX ALBERT 


What is scientific competition? When this question is posed by an economist, 
many people think they already know what the answer must be: science is a 
market of ideas, and scientific competition is like market competition.
Surprisingly, the economics of science¹ gives quite a different answer.

Of course, a certain part of science, called commercial or proprietary sci- 
ence, is a market of ideas. In proprietary science, the results of research are 
protected by intellectual property rights, mostly patents or trade secrets; they 
can be bought and sold, and their market value derives from the market value 
of the goods they help to produce. Moreover, the expected market value of an 
idea provides the incentives for investments in research. 

Competition in proprietary science is not like market competition; it is 
market competition. In contrast, scientific competition means competition 
within academic or open science and its institutions: learned societies, scien- 
tific journals, the peer review system, Nobel prizes, and modern research- 
oriented universities. 

In open science, ideas are not protected by intellectual property rights. 
Contributions to open science are published, and the ideas they contain can be 
used free of charge by anybody who wishes to do so. Although these ideas are 
nobody’s property in a legal sense, their use is regulated by moral rights or 
norms. Researchers morally “own” results if they were the first to publish 
them (the so-called priority rule, see Merton 1973); they have a moral right, 
then, to be cited by those using their results. The extent to which a 
researcher’s ideas are used by others determines the researcher’s status in the 
scientific community (Merton 1973, Hull 1988). Status is not only a reward on 
its own (Marmot 2004), but also the key to other, material rewards in open 
science. Just like patents in proprietary science, then, the norms of open sci- 
ence generate incentives to invest in new ideas. 

Is open science a market of ideas? There are certainly many similarities. In 
open science as in markets, we observe production, division of labor,
specialization, investments, exchange, risk-taking, competition but also
cooperation, and so forth.² However, these are aspects of almost all human
endeavors. It is more informative to look for differences. The most important 
difference is that both institutions use different mechanisms of collective 


1 For surveys, see Diamond (1996, forthcoming), Stephan (1996, forthcoming).
2 On differences and similarities between competition in science and on markets, see
Walstad (2002).
decision making. Markets use the price mechanism. Open science uses a
sophisticated version of the voluntary contributions mechanism based on 
competition for status. 

Many collective decisions are made through voluntary contributions, from 
the cleanliness of public spaces, which is largely determined by voluntary 
individual effort, to the financial volume of private disaster relief. Usually, 
voluntary contributions determine only the supply of some good. The special 
twist of scientific competition is that the voluntary contributions mechanism 
regulates both the supply of and the demand for research.

Looking at the supply side, we find that researchers in open science are not 
paid for each contribution. They receive a lump-sum salary that covers 
research and, possibly, other activities, notably teaching, but in the short run 
neither this salary nor other possible rewards vary with the number and 
quality of their contributions. Since, in most cases, nobody demands a specific 
contribution, individual contributions are voluntary, unsolicited, and unpaid. 

The motives behind volunteering are well-known. We can distinguish 
between consumption and investment motives. Consumption motives are 
enjoyment of one’s work, reciprocity or altruism (which are similar to 
enjoyment), and the striving for recognition and status, especially among 
insiders. In the case of science, curiosity is often mentioned, which is an aspect 
of enjoyment. Enjoyment of work usually requires the freedom to choose 
one’s tasks and the absence of control, which are characteristics of open sci- 
ence. Investment aspects are networking, building human capital, and sig- 
naling one’s ability. In the case of science, signaling one’s ability goes hand in 
hand with acquiring status among insiders; it does not matter whether one 
emphasizes the investment or the consumption aspect. 

Looking at the demand side, we see that the scientific community decides, 
in a decentralized way, about a contribution’s success. Science is cumulative: 
one researcher’s output is the next researcher’s input. A successful con- 
tribution is one that is used by other researchers as input for their own 
research. The more it is used, the higher the success. Citation statistics and 
impact factors are relevant because they measure the use of ideas.⁴

Researchers in open science compete in providing inputs for their peers. If 
they want to be successful, they must anticipate what kind of input other 
researchers would like to use; their success depends on the decisions of their 
peers. This mechanism should not be confused with peer review. Peer review is 
used to select among research proposals that compete for funding, or among 
papers that compete for publication in prestigious journals. It is a secondary 


3 See the overview in Hackl et al. (2005), partially published in Hackl et al. (2007). 

4 Though only very approximately: important ideas are used without citation when 
they have become textbook knowledge; on the other hand, many citations do not indi- 
cate use of ideas but only demarcate the contribution of a paper. 
selection mechanism that tries to deal with the scarcity of funds or of
attention. The primary selection mechanism, the selection of inputs for further
research, could work without peer review, although possibly less efficiently.

Why scientific competition? Traditionally, economists have taken it for 
granted that the price mechanism is the only efficient mechanism of collective 
decision making. From this point of view, scientific competition should be 
replaced by the price mechanism. However, with the rise of the new institu- 
tional economics (see Furubotn and Richter 2005) and its integration in the 
economic mainstream, the traditional view has lost its plausibility. Economists 
have learned that markets are not always better than hierarchies, and that 
majority voting may be ex ante efficient. Similarly, the economics of science 
started with an argument against the price mechanism. 

In their pioneering contributions, Nelson (1959) and Arrow (1962) analyzed 
the shortcomings of the price mechanism in scientific research: The exclusion 
of potential users of an idea is inefficient because additional users create no 
additional costs. Even with patent protection, the returns on investment in 
research can be appropriated only to some extent. The outcomes of research 
are highly unpredictable; thus, researchers will need insurance, but insurance 
dilutes the researchers’ incentives. Consequently, investment in research and 
utilization of its results will typically be too low. Moreover, results will 
sometimes be kept secret, which impedes further research. These problems 
will be more pronounced for basic than for applied research. 

With respect to basic research, Nelson and Arrow considered open, or not- 
for-profit, science as a solution, without, however, analyzing it in detail. This 
was done by Dasgupta and David (1994). At the heart of their argument for 
open science is a massive delegation problem. In basic research, employers of 
researchers lack the knowledge to judge the quality of research results and, 
consequently, the achievements of researchers. They cannot effectively mon- 
itor the efforts of researchers, and they cannot judge the results of these 
efforts. Hence, they cannot hire researchers on the basis of incentive contracts 
that condition payment on the quality of results. Scientific competition solves 
this delegation problem. It provides incentives to researchers and generates 
evaluations of researchers (i.e., scientific reputations) and of research results 
(i.e., extent of use by the scientific community) that can be observed and used 
by employers. Indeed, these achievements of scientific competition may 
explain the existence of open science (David 1998, 2004). 

Why care about scientific competition? European science policy seems 
currently to be fixated on the idea that promoting competition between uni- 
versities is the key to improvements in the European system of scientific 
research (see, e.g., EU Commission 2003, 2005). 

Historically, however, university competition has been neither sufficient nor 
necessary for the flourishing of scientific research. The successes of the 19th 
century Prussian university system were, to a large degree, due to central 
ministerial control — the so-called “System Althoff”, named after the
responsible civil servant. With the help of a network of personal contacts, 
Althoff extracted the information circulating in the scientific community and 
used it to hire young scientific high-potentials and to reward renowned 
researchers. Thus, the ministry circumvented university competition and, 
instead, made use of and promoted scientific competition. This central-plan- 
ning regime was preceded by a very competitive decentralized system where 
universities competed for student fees. Every employee, from the professor to 
the caretaker, got their share: a textbook case of incentive pay. However, in 
this system, the scientific standards of university education were very low, and 
universities played no role in research.⁵

The point of these historical facts is, of course, not that central planning 
works better than competition, but that scientific competition is more 
important than university competition. 

Scientific competition provides common pool resources for universities:⁶
incentives for researchers to do research and to conform to scientific stand- 
ards; evaluations of research results, which are used by universities for the 
development of academic curricula; and evaluations of researchers, which are 
used by universities for hiring and promotion decisions. These resources are 
only available, however, if universities allow their academic staff to participate 
in scientific competition. 

Competition between users of a common pool resource easily leads to
overexploitation. Consider, for instance, the following plausible scenario.
Universities compete for the services of renowned researchers, who get contracts
that allow them to do their own research. Less renowned researchers have less 
bargaining power, and administrators put them to other uses: teaching, 
administration, and research that is profitable to the university but of no 
scientific interest. This is rational from the administration’s point of view. 
However, scientific competition requires that researchers decide collectively 
about reputations, by accepting or rejecting new ideas as inputs for their own 
research. If universities want to employ researchers who have earned a rep- 
utation in this process, they must collectively bear the costs of letting other, 
less renowned researchers participate. Yet, each university is better off if it 
makes use of scientific competition without bearing its share of the costs. In 
this scenario, university competition will destroy scientific competition. 
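
The free-rider logic of this scenario can be made concrete with a toy public-goods game. The sketch below is purely illustrative and not part of the original argument: the function names, the number of universities, and the payoff parameters (a benefit of 3 per participating university, a cost of 5 for participating) are assumptions chosen only so that supporting scientific competition is collectively beneficial but individually costly.

```python
# Hypothetical toy model: each university decides whether to let its less
# renowned staff take part in scientific competition ("contribute").
# All names and numbers are illustrative assumptions, not from the text.

def payoff(contributes: bool, n_contributors: int,
           benefit: float = 3.0, cost: float = 5.0) -> float:
    """Payoff to one university.

    Every university, contributing or not, gains `benefit` for each
    contributing university, because reputations are a shared resource.
    Only contributors bear `cost`.
    """
    return benefit * n_contributors - (cost if contributes else 0.0)


def prefers_to_contribute(others_contributing: int,
                          benefit: float = 3.0, cost: float = 5.0) -> bool:
    """True if contributing beats free riding, given the others' choices."""
    pay_in = payoff(True, others_contributing + 1, benefit, cost)
    pay_out = payoff(False, others_contributing, benefit, cost)
    return pay_in > pay_out


N_UNIVERSITIES = 10

# Free riding is a dominant strategy if it is preferred no matter how
# many of the other universities contribute.
free_riding_is_dominant = all(
    not prefers_to_contribute(k) for k in range(N_UNIVERSITIES)
)

# Yet every university would be better off if all contributed.
payoff_all_contribute = payoff(True, N_UNIVERSITIES)
payoff_none_contribute = payoff(False, 0)
```

Under these assumed numbers, participation costs each university more than it gains individually (5 versus 3), so every university free rides, although universal participation would leave each of them better off. This is the sense in which university competition can destroy scientific competition.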

This is not the place to evaluate current policies. Our concern here is with 
the scientific basis of these policies, which fails to take scientific competition 


5 See Clark (2006) and, specifically on the “System Althoff”, Vereeck (2001). See
Burchardt (1988, 185) for an example of the distribution of fees from the University of
Berlin, and this university’s statutes, Statuten der Friedrich-Wilhelms-Universität in Berlin
v. 31.10.1816, which were typical for the time. I am obliged to Lydia Buck for bringing
these historical facts and the relevant literature to my attention.

6 On common pool resources and their governance, see Ostrom (1990).
into account. The EU Commission (2003, 2005), for instance, never mentions
scientific competition, under this or a different name. This is like reforming 
capitalism and forgetting about the price mechanism. It is hard to believe that 
successful policies can be developed on such a basis. 


The Contributions to this Volume 


The papers in this volume deal with core aspects of the theory and policy of 
scientific competition. They have all been presented and extensively discussed 
at a conference in Saarbrücken in October 2005. They appear here in revised 
form, together with the revised versions of the comments that were also 
presented at the conference. 

The economics of science has always been an interdisciplinary undertaking. 
Economists have learned much from sociology (see esp. Merton 1973). 
Problems of intellectual property rights are discussed by lawyers and econo- 
mists. There are also strong connections between the philosophy of science, 
which has taken an institutionalist turn with the work of Karl Popper, and the 
economics of science (H. Albert 2006). The present volume continues the 
interdisciplinary tradition and contains contributions from economics, law, 
philosophy of science, political science, and sociology. 

The first four papers are concerned with supply-side considerations: the 
supply of researchers and their productivity. Paula Stephan starts from the 
observation that employment conditions in science have changed. Today, the 
prerequisites for productive research — access to equipment and colleagues, a 
certain degree of autonomy, job or funding security — are often missing. An 
increasing percentage of young researchers get stuck in laboratory jobs where 
they are not doing their own research. These employment conditions will 
reduce the future supply of young researchers since the current generation’s 
experiences influence the next generation’s expectations. The current system 
of research may not be sustainable, then, since it requires a large supply of 
young researchers motivated by the expectation of getting one of the research 
positions that are becoming increasingly scarce. 

Günther Schulze also looks at the supply of researchers, but from a very
different perspective. He analyzes the supply of university professors through 
the states in a federal system. The number of professors is an important part of 
educational services; indeed, Schulze treats this number as a proxy for edu- 
cational services. He shows that states have an incentive to attract high school 
graduates from other states by providing capacity in tertiary education, 
thereby free riding on educational services provided in the primary and sec- 
ondary education by other states. Optimal tertiary education is less than 
proportional to the size of the jurisdiction. For Germany he shows current 
trends in provision of professors and the production of new professors, 
proxied by the number of habilitations. He analyzes the differences in the
relative number of professors, their determinants, and the resulting
cross-border student migration for the German federal states.

The next two papers are concerned with the measurement of productivity in 
science. Gustavo Crespi and Aldo Geuna consider the determinants of science 
research output (as measured by publications and citations) in the UK. They 
use an original dataset including information for the 52 “old” UK universities 
(which account for about 90% of research expenditure) across thirty scientific 
fields for a period of 18 years, from 1984/85 to 2001/02. On this basis, they 
investigate the relations between the investment in higher education and the 
research outputs, rejecting the model of a global science production function 
for the UK in favor of four significantly different production functions for the 
medical sciences, the social sciences, the natural sciences and engineering. 

While Geuna and Crespi look at the macroeconomics of scientific pro- 
ductivity, Michael Rauber and Heinrich Ursprung focus on the micro- 
economic aspects. They argue that a bibliometric evaluation of researchers 
should take life cycle effects and vintage effects into account, and demonstrate 
the crucial importance of these effects in a bibliometric study of the research 
behavior of German academic economists. On the basis of this study, they 
develop a simple ranking formula that could be used for performance-related 
remuneration and track-record based allocation of research grants. They also 
investigate the persistence of individual productivity, which is relevant for 
tenure decisions, and develop a faculty ranking which is insensitive to the 
faculty age structures. 

These supply-side considerations are followed by five papers that are con- 
cerned with specific institutional aspects of open science. Martin Kolmar 
compares open and proprietary science from a theoretical perspective. For the 
purposes of his paper, proprietary science is identified with research leading to 
patents. Open science is modeled as a contest for a prize (research grants, 
tenure, etc.), with the research output becoming a public good. Kolmar con- 
siders a case where the research results may be used to reduce production 
costs in an oligopolistic downstream market. Thus, the focus is on applied 
science, which is quite often viewed as the natural domain of proprietary 
science. Nevertheless, the patent system turns out to be inefficient, because 
the patent holder has an incentive to restrict the number of licenses too much 
and because incentives for research are too weak. Open science, on the other 
hand, may be efficient, and even when not, it may be second-best optimal. 

Christine Godt is also concerned with problems of the patent system. She 
questions, from a lawyer’s perspective, the view that the possibility of pat- 
enting actually provides incentives for a better technology transfer from 
research institutions to industry. The problem is that the accumulation of 
royalties through several stages of a typical innovation process — a phenom- 
enon called “royalty stacking” — eats up the profit margins on the downstream 
market. Royalty stacking is a result of two distinct mechanisms, one proprietary,
the other contractual. The proprietary mechanism is rooted in the
expansion of patents into the traditional domain of open science. The con- 
tractual mechanism is primarily due to the transition from sale contracts to 
lease contracts in the downstream market. In combination, these two 
mechanisms can impede the technology transfer when the royalty share 
becomes too large. 

Nicolas Carayol analyzes the theoretical basis of the so-called Matthew 
effect in science. This effect was proposed by Merton as an explanation of the 
typical career patterns in science. It assumes that early successes in science lead 
to a more successful career because successful young researchers get better 
jobs with better research opportunities. Thus, an outstanding career in science 
may be the result not of exceptional ability, but of accidental early success. 
Carayol explains the Matthew effect in a dynamic model of university com- 
petition. The basis of the effect is an externality between researchers: suc- 
cessful old researchers confer an advantage to their younger colleagues. This 
implies that young researchers who get jobs at high-reputation universities will 
go on to be more successful than their peers at low-reputation universities, 
which perpetuates the reputation differences between universities. 

Carayol’s model hints at a further important aspect of academic life. 
Externalities between researchers can be interpreted as access to research 
networks. The great practical importance of these networks becomes much 
clearer in Dorothea Jansen’s paper, which reviews the results of a large 
sociological research project under her direction. The project focuses on 
networks in astrophysics, nanotechnology and microeconomics, collecting 
data on existing networks and analyzing correlations between network 
properties like size and density on the one hand and success in research on the 
other hand. The European and German science policies actively promote such 
networks. Among other things, the empirical results show the first consequences of
these policies. 

Christian Seidl, Ulrich Schmidt and Peter Grösche present the results of an 
empirical investigation of the referee processes of economic journals. Peer 
review, and especially the referee process of scientific journals, is a central 
institution of modern open science. Seidl, Schmidt and Grösche argue that 
publications in refereed journals today serve mainly as quality signals, influ- 
encing personal advancement, research opportunities, salaries, grant-funding, 
promotion, and tenure. For this reason, they consider the validity, impartiality, 
and fairness of the referee process as very important. The literature, however, 
casts doubts on the idea that journal referee processes satisfy these require- 
ments. Their own investigation shows that authors in economics value com- 
petence and carefulness of the reports more than positive decisions by editors. 
Competence and carefulness, however, are often missing. Moreover, reports 
in economics often fail to help authors improve their manuscripts. 
The volume concludes with two papers devoted to collective decision
making in science. Jesús Zamora Bonilla applies the perspective of constitutional
political economy to methodological rules in science. Combining
philosophy of science with game theory, he conceives of science as a game of 
persuasion in which competition for status forces scientists to accept meth- 
odological rules and to acknowledge the contributions of their competitors. 
On the basis of a specific model, he argues that mutual control in a scientific 
community ensures that the norms of science are followed frequently, if not 
perfectly. 

Christian List discusses collective decision making in science from a very 
different, non-competitive perspective, namely, social-choice theory. Drawing 
on models of judgment aggregation, he addresses the question of how a group 
of individuals, acting as a multi-agent cognitive system, can “track the truth” 
in the outputs it produces. He argues that a group’s performance depends on 
its “aggregation procedure” — its mechanism for aggregating the group 
members’ inputs into collective outputs; for instance, voting on the truth of 
propositions — and investigates the ways in which aggregation procedures 
matter. These considerations are highly relevant in connection with scientific 
committees that try, against the background of scientific competition with its 
differences of opinion, to formulate a scientific consensus, as, for instance, in 
the case of climate change. 

These eleven papers, with accompanying comments, highlight the diverse 
problems and questions turning up when we try to understand scientific 
competition. They also illustrate the breadth of contemporary economics of 
science, its many ties with neighboring fields, and its potential to improve 
science policies. 
Job Market Effects on Scientific Productivity*


by 
PAULA STEPHAN 


1 Introduction 


Much of the discussion in science policy circles today focuses on the question 
of whether the production of basic knowledge is threatened by a shift of 
emphasis in the public sector towards facilitating technology transfer. There 
are at least two variants of the crowding-out hypothesis. One variant argues 
that in the changing university culture scientists and engineers increasingly 
choose to allocate their time to research of a more applied as opposed to basic 
nature.¹ Another variant of the crowding-out hypothesis is that the lure of
economic rewards encourages scientists and engineers (and the universities 
where they work) to seek intellectual property (IP) protection for their 
research results, eschewing (or postponing) publication, and more generally to 
behave more secretively than in the past.² Much of the work of Blumenthal
and his collaborators (1996) focuses on the latter issue in the life sciences, 
examining the degree to which university researchers receive support from 
industry and how this relates to publication. A related concern is that the 
granting of intellectual property can hinder the ability of other researchers to 
build on a given piece of knowledge. This anti-commons hypothesis, articu- 
lated by Heller and Eisenberg (1998) and David (2001), postulates that the 
assignment of intellectual property rights discourages the use of knowledge by 
other researchers. 

How changing property rights in science affect the production of new 
knowledge is clearly of great relevance to the future of scientific productivity. 
But there are other reasons to be concerned about the production of scientific 
knowledge. This paper focuses on these. To wit: who will do science? Will they 
work in an environment conducive to doing research? The premise of the 
paper is that researchers’ productivity is affected by the environment in which 
they work and the conditions of their employment. For example, access to 


* This paper builds on the presentation that Stephan made at the conference “The 
Future of Science,” Venice, Italy, September 2005. The author would like to thank Grant 
Black, Chiara Franzoni, and Daniel Hall for their assistance. The author is indebted to 
Bill Amis, Chiara Franzoni, Bernd Fitzenberger, Christine Musselin, and Günther 
Schulze as well as participants at the conference on Scientific Competition for their 
useful comments. All errors are those of the author. 

1 The model examined by Jensen and Thursby (2003) suggests that a changing reward
structure may not alter the research agenda of faculty specializing in basic research. 

2 Clearly, these two variants are not mutually exclusive.
equipment and colleagues clearly affects productivity. Productivity is further
enhanced by researchers’ having a certain amount of autonomy. Moreover, a 
research horizon, facilitated by job security or funding security, encourages 
scientists to choose more risky projects than they might otherwise choose. 
And it doesn’t hurt if scientists work in such environments when they are 
young. Research consistently finds evidence of a relationship between age and 
productivity (Levin and Stephan 1991, Stephan and Levin 1992 and 1993, 
Jones 2005, Turner and Mairesse 2005). For what we might call journeymen 
scientists, the relationship is not pronounced. But for prize-winning research, 
there is considerable evidence of a strong relationship (Stephan and Levin 
1993). While it does not require extraordinary youth to do prize-winning 
work, the odds decrease markedly by mid-life. Stephan and Levin (1993) 
report that the median age at which Nobel laureates commenced work on the
problem for which they won the prize is 36.8 in chemistry, 34.5 in physics, and
39.0 in medicine/physiology for the first 92 years that the prize was awarded.
For the more recent period, they find that the median age is 38.5 in chemistry,
36.0 in physics, and 35.0 in physiology/medicine (Stephan, Levin and
Xiao, unpublished data). They conclude (1993, 397) “that regardless of field, 
the odds of commencing research for which a Nobel Prize is awarded decline 
dramatically after age 40.” Research opportunities for young scientists affect 
not only the productivity of the current generation of scientists. They also 
affect the scientific enterprise in years to come, since the supply of new sci- 
entists is responsive to the job opportunities and job outcomes that the current 
generation experiences. 

Historically, scientists and engineers received doctoral training with the goal 
of achieving a research position either at a university or, depending upon the 
country, a research institute. In some instances, scientists and engineers 
worked in large industrial research labs, although in the 20th century this
pattern was more common in the U.S. than in Europe. 

In many western countries today young scientists face problems obtaining 
research positions that have characteristics conducive to doing good research. 
Here we discuss problems facing young scientists, drawing examples from the 
United States, Italy, and Germany. We also discuss factors contributing to the 
dismal job outlook faced by young scientists today. We focus on those working 
in the fields of the physical, life and mathematical sciences, as well as engi- 
neers, excluding those working in the social sciences from our discussion. 


2 Problems facing young scientists 
2.1 The situation in the United States 


Public sector research in the United States occurs primarily in the university 
sector, although some public research is produced at Federally Funded 
Research and Development Centers (FFRDCs) and at national laboratories, 
such as the National Institutes of Health. Within the university sector, by far 
the lion’s share of research is conducted at what are known as Research One 
institutions, institutions such as Harvard, MIT, University of Michigan, Uni- 
versity of Wisconsin, etc., classified by Carnegie as a “one” based on the 
amount of research funding that they receive and the number of PhD students 
that they educate. There is also a long tradition in the United States, as noted 
above, of scientists and engineers working in large industrial labs. Three 
noteworthy examples of such labs that flourished during the 20th century were
those at Bell, DuPont and IBM. 

Graduate students in the U.S. have a strong tradition, albeit a field-dependent
one, of aspiring to a tenure track position at a research university.
A survey of U.S. doctoral students in the fields of chemistry, electrical engi- 
neering, computer science, microbiology and physics during the academic year 
1993-1994 found that 36% of the respondents aspired to a career at a 
research university; 41% aspired to a career in industry/government (Fox and 
Stephan 2001).³ The preferences vary considerably by field; in microbiology
and in physics more than 50% of the men preferred academic research 
positions as did 40% of the women surveyed. In chemistry and electrical 
engineering, which have a long tradition in the United States of employment 
in industry, a substantially lower percentage prefers research positions in academe
compared to research positions in industry or government. 

The university sector in the United States has been characterized by a 
tenure system that determines, within a period of no more than seven years, 
whether an individual has the option to remain at the institution or is forced to 
seek employment elsewhere (Stephan and Levin 2002, 419). If the individual 
receives tenure, s/he is promoted to the rank of associate and subsequently full 
professor if the research record continues to merit promotion. Prior to being 
hired as an assistant professor it has become increasingly common to take a 
postdoctoral position. 

The importance of tenure makes it crucial for young scientists to signal to 
older colleagues that they have the “right stuff” for doing research. A necessary 
component of this signal is the ability to establish a lab of one’s own. 
And while startup capital is generally provided by the institution (Ehrenberg, 
Rizzo and Jakubson 2003), finding the funds necessary to run the lab (not only 
to buy supplies and equipment but also to hire graduate students, fund 
postdoctoral positions, and hire technicians) is the responsibility of the 
individual (Stephan and Levin 2002, 419). 

3 The mail survey was administered by Fox to a national sample of 3800 doctoral 
students. The response rate was 62%. Respondents were asked “After receipt of your 
PhD, do you prefer to pursue an academic or nonacademic (industrial, government) 
career?” The response categories were: (1) “academic with emphasis upon research;” (2) 
“academic with emphasis upon teaching;” and (3) “nonacademic.” 

Typically the scientist applies to a research institute of the Federal gov- 
ernment for a research grant, although some resources for research come from 
the private sector (such as the Howard Hughes Medical Institute) and some 
(and an increasing portion) come from the university itself. In 2001, for 
example, 59% of the funds for research in the academic sector came from the 
Federal government; 7.1% came from state and local governments, 6.8% 
came from industry, 7.4% came from other places and 20% came from uni- 
versities themselves (National Science Board 2004, chapter 5). 

The field that has grown the most rapidly in the United States is that of 
biomedical sciences. Growth has occurred both in terms of the number of 
PhDs produced and the amount of funding available for research. For 
example, PhD production in the slightly broader area of the biological and 
agricultural sciences grew from 2711 in 1966 to 6798 in 2000 (National Science 
Foundation 2002). Funding from the National Institutes of Health doubled 
over a recent five-year period, going from $13.648B in 1998 to $27.181B in 
2003.⁴ Here we examine the prospects of young PhDs trained in the biomedical 
sciences in the United States to be hired into a permanent position at 
a Research One university, as well as their prospects to get funding. 

Figure 1 shows the dramatic increase in the number of PhDs age 35 or 
younger trained in the biomedical sciences in the United States. Data for the 
figure come from the Survey of Doctorate Recipients (SDR), a biennial 
survey overseen by Sciences Resources Statistics of the National Science 
Foundation and drawn from the sampling frame of the Survey of Earned 
Doctorates (SED), a census of all new PhDs in the U.S.⁵ We see that the 
number of PhDs 35 years of age or younger, trained in biomedical sciences in 
the United States, grew by almost 60% during the short interval of eight years, 
going from 11,715 to 18,671. We also see that the number of tenure-track 
positions has grown by only 7% during the same period, going from 1212 to 
1294. Thus, the probability that a young person trained in the biomedical 
sciences in the United States holds a tenure track position has declined 
considerably in recent years, going from 10.3% to 6.9%.⁶ When we focus on 
Research One institutions, we see a similar pattern. We estimate that 618 
PhDs age 35 or younger trained in the biomedical sciences held tenure track 
positions at Research One institutions in 1993 (5.3% of those 35 or younger). 
Eight years later, 543 (4.4%) held such positions. 

4 http://www.faseb.org/opa/ppp/fed_fund/NIH_funding_trends_4x13x04_files/frame.htm 

5 The SED is administered to all PhD recipients. The SDR is administered to a sample 
drawn from the SED. The tabulations presented here use weighted data from the SDR. 
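The growth rates and tenure-track shares quoted for Figure 1 follow directly from the four counts given in the text. The short Python sketch below reproduces that arithmetic; the helper names `pct_change` and `share` are illustrative, and the end year 2001 is an assumption taken from the 1993-2001 period named in footnote 6.

```python
# Counts quoted in the text for biomedical PhDs age 35 or younger
# (weighted SDR tabulations). The end year 2001 is assumed from footnote 6.
phds = {1993: 11_715, 2001: 18_671}
tenure_track = {1993: 1_212, 2001: 1_294}


def pct_change(old, new):
    """Percentage growth from old to new."""
    return 100.0 * (new - old) / old


def share(part, whole):
    """Part as a percentage of whole."""
    return 100.0 * part / whole


for year in (1993, 2001):
    pct = round(share(tenure_track[year], phds[year]), 1)
    print(f"{year}: {pct}% of young PhDs hold tenure-track positions")

print("Growth in young PhDs (%):",
      round(pct_change(phds[1993], phds[2001]), 1))
print("Growth in tenure-track positions (%):",
      round(pct_change(tenure_track[1993], tenure_track[2001]), 1))
```

Run as written, the sketch gives shares of 10.3% and 6.9% and growth of 59.4% and 6.8%, in line with the "almost 60%" and "only 7%" reported above.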


Figure 1 Biomedical PhDs Age 35 or Younger in United States 
Source: Computations, SDR (see text). 


The situation is not limited only to those under 35, as is readily seen in 
Figure 2 (see p. 16), which shows the number of biomedical PhDs between 36 
and 40 in tenure track positions to be almost flat during the period. More 
generally, the number age 55 and under holding tenure track positions has 
been fairly constant over the eight-year interval; the only growth has been for 
those greater than 55 years old. 

Not surprisingly, young PhDs trained in the biomedical sciences are having 
difficulty garnering a first award from the National Institutes of Health, as 
shown in Figure 3 (see p. 17). While in 1979 NIH made awards to almost 1200 
principal investigators (PI’s) 35 or younger, by 2003 the number had declined to 
approximately 200 (National Academies of Science 2005). More generally, the 
average age at first major independent research support has increased from 37 
in 1980 to 41.9 in 2002 for PhDs.⁷ The decline cannot be attributed to a lack of 
resources, given the tremendous amount of growth that occurred in the NIH 
budget during this period. Nor can it be attributed to a decline in supply of 
young investigators (see Figure 1 p. 15). Neither can it be attributed to the 
quality of the proposals submitted by those 35 or younger. NIH data indicate 
that the success rates for new funding are higher for those 35 and younger than 
for any other age group; the second highest success rate is for those 36 to 40. 
Rather, the decline reflects the older age at which young researchers obtain a first 
permanent position from which they can apply for funding.⁸ The funding 
situation was of sufficient concern for the National Academies of Science (NAS) 
to appoint a committee, chaired by Nobel laureate Thomas Cech, to study the 
issue. Their report, entitled “Bridges to Independence,” was issued in 2005. 

6 Increasingly faculty are hired into non-tenure track positions that have the title of 
assistant, associate or full professor. The number of young individuals holding such 
positions grew from 389 in 1993 to 527 in 2001. Including this group with the tenure 
track group, the probability of being in a faculty rank position has declined from 13.7% 
to 9.7% during the 1993-2001 period for those 35 and younger. 

Figure 2 Tenure Track Biomedical Faculty by Age: United States 
Source: Computations, SDR (see text). 

More generally, the success patterns reflect the changing composition of 
PhD employment at U.S. universities. Specifically, universities are increasingly 
hiring part-time and non-tenure-track faculty; they employ more and 
more postdoctorates and staff scientists. For example, the percent of biomedical 
PhDs working at universities and employed in non-tenure-track 
positions grew from 26% to 33% in the eight-year period 1993 to 2002. This 
matches a national trend across disciplines and universities. Figure 4 (see 
p. 18) shows the ratio of full-time non-tenure-track faculty to full-time faculty at 
Research One institutions (Ehrenberg and Zhang 2005, table 3A.1). The data 
are displayed for both public and private institutions. In both instances, we see 
a substantial increase over time. For example, at public institutions, the ratio, 
which was .245 in 1989, had climbed to .375 by 2001; in private institutions it 
had started at .312 and eventually increased to .434 by the year 2001.⁹ 

7 First independent research support consists of either an R01 grant or, in earlier 
years, an R29 award. 

8 Researchers typically hold a position for two or three years before submitting a grant 
proposal. One reason for this is that the grant application must show evidence relating to 
prior results. 

Figure 3 National Institutes of Health Awards To Those 35 and Under, 
United States 
Source: National Academies of Sciences (2005). 

It should be noted that postdoctoral appointments are usually not included 
in these data since the postdoctoral position is generally classified as a training 
position and hence is generally not processed as a hire. During this interval, 
the number of individuals working in postdoctoral positions has increased 
dramatically (Ma and Stephan 2005), going from 23,000 in 1991 to 30,000 in 
2001.¹⁰ Ma and Stephan find the propensity to take a postdoctoral position to 
be inversely related to demand for positions in academe. For example, they 
find the probability to be negatively and significantly related to the percent 
change in current fund revenue for institutions of higher education.¹¹ 


9 The tabulations are based on data from the biennial IPEDS Fall Staff Surveys. 

10 Richard Freeman (unpublished presentation) estimates that the ratio of 
postdoctorates to tenured faculty positions in the life sciences went from .54 in 1987 to .77 in 
1999, an increase of 43%. 

11 They also find the propensity to be positively related to the size of the PhD’s cohort, 
suggesting that other things equal, as supply of new PhDs increases, recent PhDs are 
more likely to take postdoctoral positions. 

Figure 4 Full-time Non-tenure-track Faculty/Total Full-time Faculty 
at Research One Institutions: United States 
Source: Ehrenberg and Zhang (2005). 


Several factors explain these hiring trends. First, cutbacks in public funds 
and lowered endowment payouts clearly affect hiring. Second, salaries of 
tenure-track faculty are higher than those of non-tenure-track faculty and 
research shows (Ehrenberg and Zhang 2005) that this leads to a substitution 
away from tenure-track positions. Third, funding for non-permanent positions 
such as staff scientist is available in research grants. The high cost of start-up 
packages also plays a role in explaining these trends. A survey of start-up 
packages by Ehrenberg, Rizzo and Jakubson (2003) finds that private 
Research One institutions spend on average $403,071 on the start-up packages 
for assistant professors, while public Research One institutions spend on 
average $308,210. Given these sums, when universities do hire in the tenured 
ranks, they are tempted to recruit senior faculty away from another university, 
rather than hire an as yet untested junior faculty member. The financial risk is 
considerably lower. While the start-up packages are generally higher at the 
senior ranks, the university gets an immediate transfer of grant money, 
because the senior faculty generally bring existing research grants with them 
when they come. 

Despite this situation, many young scientists persist in aspiring to a traditional 
academic career. Geoff Davis’s (2005) recent survey of postdocs found 
that the overwhelming majority of those looking for a job were “very interested” 
in working at a research university.¹² While any sample of postdocs is 
inherently biased towards those preferring such employment, the odds that the 
respondents will achieve a tenure-track position are, as the above statistics 
indicate, not good. 

12 Davis reports that 1110 of the 2770 respondents indicated that they were looking for 
a job. Among these, 72.7% were “very interested” in a job at a research university and 
23.0% were “somewhat interested.” 

The academic labor market in the United States has been characterized by 
Stephan and Levin (2002) as building upon a series of implicit contracts. 
Graduate students and postdocs enter a program and provide some “surplus” 
for the lab through their work as a research assistant or postdoc, and then 
leave the institution to begin a research career. The professor has an incentive 
to not cheat on the arrangement. If the student is kept too long, or educated 
too poorly to be considered employable by a future dean, or provided poor 
information concerning job outcomes, in theory the professor will cease to be 
able to attract top graduate students and the source of labor, compensated 
well below its opportunity cost, will dry up. 

This system, which loosely resembles a pyramid scheme, works reasonably 
well as long as there is a growing demand for faculty positions. But for this to 
occur, funding for science must not only grow, but must grow sufficiently fast 
to absorb the growing workforce of scientists. Such a tremendous growth in 
resources is something that the U.S. system has been unable to provide, par- 
ticularly in recent years. 

But still the system survives and young scientists continue to be recruited 
into PhD programs. Stephan and Levin (2002) argue that three factors have 
allowed it to persist: (1) the demand for college education by the baby 
boomers in the 1960s and 1970s, which provided fuel for the system to expand; 
(2) the concept of “postdoctoral study” and (3) the eagerness of foreign 
nationals to study in the U.S. While the first factor is no longer relevant, the 
second and third are. The postdoctoral position provides relief for the system 
in several ways. First, by providing employment opportunities for newly 
minted PhDs it provides professors an “out” by allowing them to place their 
students more easily. Second, recipients realize that the postdoctoral position 
enhances their research record and thus permits them to signal their research 
capabilities. Finally, and perhaps unwittingly, it diffuses the role that place- 
ment plays in recruiting students to study. If applicants to graduate school 
inquire about job placements in academe, they can be told that academe no 
longer recruits faculty directly from PhD programs, but instead, only considers 
applicants with postdoctoral experience. The professor is, so to speak, “off the 
hook”. The large presence of foreign nationals diffuses even more the role 
that placement plays. Rarely do foreign nationals applying to graduate school 
inquire about job prospects. In an international context, their prospects are 
significantly higher as a result of studying in the U.S. than they would be if 
they were not to study in the U.S. Thus, many of the self-correcting mechanisms 
that might otherwise result have failed to take place.¹³ 


2.2 The situation in Italy 


Public sector research in Italy occurs in the university sector and at public 
research institutions (PRIs). Within the PRI sector, the National Research 
Council (CNR) employs approximately 80% of all PRI researchers.¹⁴ Tenured 
positions at universities exist at three levels: researcher, associate professor 
and full professor. Universities also employ contract researchers as temporary 
employees. Researchers at CNR are hired either into temporary contract 
positions or into tenured positions (Ricercatore or Primo Ricercatore). 


Figure 5 Age of Tenured Academics in 2004: Italy 
Source: MIUR (Ministry of Italy for University and Research): 
http://www.miur.it/scripts/visione_docenti/vdocenti0.asp 


13 U.S. students, as opposed to international students, increasingly find careers in 
science and engineering to be not to their liking. Considerable concern has been 
expressed in policy circles regarding this decline in interest. 

14 The other public research institutions in Italy are the National Institute of Nuclear 
Physics (INFN) and the National Institute of Health (ISS). 

Figure 6 Age Distribution of CNR Tenured Researchers in 2004: Italy 

The job prospects of young PhDs within the university sector have been 
bleak in recent years and in 2003 a “no new permanent position” policy went 
into effect. This has resulted in a situation in which the share of temporary 
researchers at universities has reached 50% in some instances, with young 
people being heavily concentrated in temporary positions (Avveduto 2005). 
Figure 5 shows the age distribution for faculty holding tenured positions at 
Italian universities in 2004. The average age of researchers is 45; that of 
associate professors is 51.7 and that of full professors is 58. 
What is not shown, but worth noting, is that the average age of researchers has 
increased by more than two years during the seven-year interval from 1997 to 
2004. 

The situation is no better within the CNR, where a “no new permanent 
position” policy went into effect in 2002. The high number of retirements coupled 
with the hiring freeze has led to a disproportionate number of young scientists 
in temporary positions; the share of temporary researchers has grown to over 
50% and the average age of the CNR researcher is now above 47. Figure 6 
shows the age distribution for CNR researchers in tenured positions. The 
average age for those in the position of Ricercatore is 42; for those in the position 
of Primo Ricercatore it is 55.¹⁵ 

One response to the poor job prospects for young PhDs in Italy has been for 
young scientists to leave the country to find employment. A 2002 CENSIS 
survey of 1996 Italian researchers working abroad found that the common 
reason for leaving Italy is lack of access to and progression in a career in the 
Italian scientific environment. 


2.3 The situation in Germany 


The article by Schulze (2008) in this book points to the softness of the aca- 
demic labor market in Germany. For example, figure 1 of his chapter shows 
that the number of professors at German universities peaked in 1993 at about 
23,000 and has been, with few exceptions, steadily declining ever since. In 
2004, the last year for which he reports data, the number stood at just slightly 
over 21,000. The decline is not due to a decline in the number of students. The 
author shows that during the same period the number of high school gradu- 
ates increased significantly. He calculates that the ratio of professors per 100 
high school graduates “has deteriorated significantly from 11.26 in 1996 to 
9.43 in 2004” (section 3.2). 

The decline has come at the same time that the number of Habilitationen, a 
requirement for obtaining an appointment as a professor at most institutions 
and in most fields, has grown dramatically.¹⁶ To wit, since 1992, when 
approximately 1300 Habilitationen were produced annually, the number had 
grown by 2004 to approximately 2200 per year. In terms of Habilitationen per 
100 professors, there has been more than a 66% increase during the period.¹⁷ 
Using a back of the envelope type of calculation, Schulze (2008) estimates that 
the ratio of new applications to job openings rose from roughly 3/2 to 5/2 
during the 14-year period that he analyzes. 

It is not only that the job prospects for individuals who have recently 
received their Habilitationen are poor at German universities. It is also the 
case that, if and when they do receive a permanent position and the research 
autonomy that comes with a permanent position, they are around 42 years of 
age (Mayer 2000). Musselin (2005), in her comparison of French, U.S. and 
German academic career paths, notes that among the three countries studied 
the age of obtaining a permanent, tenured position is highest in Germany. 
Moreover, autonomy has not been possible for young scientists in Germany, 
since independent untenured positions have not existed for them. 

15 The average age of tenured new hires at CNR has increased from 30 to 35 since the 
late 1980s; the average age of non-tenured new hires is 33.6. 

16 The typical academic career path in Germany involves preparing the Habilitation. 
After completion, and pending availability of a position, one is hired into a C3 or C4 
(now W2 or W3) position which must be at an institution other than where the 
Habilitation was prepared. 

17 The situation is reminiscent of that in the U.S. with postdocs. While the number of 
tenure-track faculty positions has grown minimally during the last ten to fifteen years, 
the ratio of postdocs to faculty has grown dramatically (see footnote 10). The incentive to 
recruit individuals to prepare the Habilitation is similar to the incentive to recruit 
graduates to hold a postdoc position. Both are cheap and productive. 

Recently Germany has instituted reforms that could have a significant 
effect on the academic labor market. Specifically, while heretofore individuals 
could generally not be appointed to a professorial post until they had obtained 
the Habilitation,¹⁸ the reforms mean, depending upon the state, that the 
Habilitation could disappear and the post of junior assistant professor would 
then be accessible directly after the doctorate. Contracts for the junior professor 
are for three years and renewable one time.¹⁹ In certain ways, this 
system resembles that of the United States. However, it will not necessarily 
follow that being hired into a junior position (and renewed) provides for 
entrée into the position of professor. This will depend not only upon the 
quality of one’s work (as in the U.S.) but also upon availability of posts at the 
professor level. While positions can be cut in the United States, it is uncom- 
mon for an untenured faculty member who merits promotion to be denied 
tenure and promotion because the position no longer exists. Rather, the 
position will persist and can be changed from that of an assistant to that of an 
associate or full over the course of the scientist’s career. 

A second reform measure involves a move from the “C” to the “W” system. 
Although the reform was ostensibly designed to provide for performance- 
based salary increases, it arguably may not succeed in accomplishing this goal. 
A major component of the change is the way in which base salaries are 
negotiated. Under the C system, faculty having a competing job offer could 
negotiate a higher salary at their home institution. The resulting raise was 
permanent and included in the base used for the computation of pensions. 
Under the W system, the base salary has been lowered with the idea that 
performance-based supplements would be possible. The supplements are in 
principle for a limited period of time. Only if they have been granted for five 
or more years do they become permanent, although the latter is subject to 
negotiation. 

The W system has the potential of reducing mobility and penalizing productive 
faculty since for C4 professors it is almost impossible to obtain a 
competitive W3 job offer. Moreover, not only is the W salary lower, but by 
switching to a W position, the professor gives up the moderate increases in 
salary that accompany the C position. Thus, it is likely that the switch will 
make employment at German universities less attractive for productive academics 
and increase the incentives to go abroad. 

18 There are exceptions to the Habilitation requirement. For example, one could 
submit equivalent academic achievements, such as publications, and in technical 
universities many professors do not have a Habilitation. 

19 In certain cases junior professors can be tenured if they change universities after 
completing the Ph.D. 


2.4 The situation elsewhere 


This situation is not unique to Italy, Germany, and the United States. In 
France, for example, restrictions have led to poor job prospects for scientific 
employment in the public sector, which makes up half of R&D employment 
(European Commission 2004, 34). The number of contract researchers dou- 
bled during the 1990s in the United Kingdom. Most European countries are 
also experiencing a brain drain. By way of example, 75% of the 15,158 
Europeans who received their PhD in the U.S. between 1991 and 2000 indi- 
cated that they preferred to stay in the U.S. after the PhD to establish their 
career. About 50% indicated that they had a firm offer of employment 
(Science and Technology Indicators 2003, chapter 3). 

To summarize, young scientists today in many western countries have difficulty 
getting the type of research position (one that provides for autonomy 
and a sufficient time horizon) that they anticipated getting when they began 
their studies. They end up working for long periods in a postdoctoral fellowship 
or in temporary positions as staff scientist or contract researcher. If 
and when they do get a position that provides for autonomy they are older. 

This situation has negative effects on scientific productivity. First, and 
foremost, is the loss in productivity of what the young could have discovered if 
they had had increased autonomy and a longer horizon. A second effect is the 
loss in terms of the negative signal such outcomes send to younger people that 
science may not be a choice career. To quote Michael Teitelbaum of the 
Alfred P. Sloan Foundation (unpublished 2005), “Bad job prospects reinforce 
lack of interest”. The preface to “Bridges to Independence” makes the case by 
imagining the year 2029 and a NAS committee assigned to trace the root 
causes of the U.S.’s fall from preeminence in biomedical sciences. “It was not 
difficult for the NAS Committee in 2029 to trace the root causes of the U.S. 
fall from preeminence in biomedical sciences. American college students had 
always paid close attention to what their peers had to say: The stories of a 
decade-long post-baccalaureate training period characterized by long hours 
and low pay were discouraging enough, but when coupled with the slim 
chance of advancing to an independent research position before the age of 40, 
few of the most talented American students were enticed” (National Acad- 
emy of Science 2005, vii-viii). The European Economic and Social Com- 
mittee observed with regards to the document “Towards a European 
Research Area”: “One reason for the current lack of new recruits in science 
and technology is that a few years ago a very large number of young scientists 
- even those with excellent qualifications - were unemployed” (European 
Economic and Social Committee CES 595/2000, 15).²⁰ 


3 Shortage 


Despite these facts, it is common for policy groups on both sides of the 
Atlantic to declare an impending shortage of scientists and engineers. A 2003 
report issued by the National Science Board (2003) concluded that “Analyses 
of current trends (in U.S. science and engineering workforce) indicate serious 
problems lie ahead that may threaten our long-term prosperity and national 
security.” A 2003 European Commission Communication, “Investing in 
research: an action plan for Europe” concluded that “Increased investment in 
research will raise the demand for researchers: about 1.2 million additional 
research personnel, including 700,000 additional researchers, are deemed 
necessary to attain the objectives, on top of the expected replacement of the 
aging workforce in research.” 

Predictions of shortages exacerbate the problem. Encouraging individuals 
to enter a career when prospects are poor can have serious longer term 
consequences. Moreover, such forecasts diminish the credibility of the 
organization declaring the shortage, as the National Science Foundation 
learned all too painfully in the 1980s. 


4 Positions in industry 


In recent years the employment of scientists and engineers in industry has 
grown rapidly in the United States, as indicated in Figure 7 (see p. 26). In 
chemistry and engineering more than 50% of all PhDs work in industry and 
have for a considerable period. Although the percent is considerably lower in 
math/computer science and the life sciences, it has grown rapidly in recent 
years, tripling in the case of math and computer science and doubling in the 
case of the life sciences. Moreover, it would be incorrect to think of these jobs 
as only concentrated in development work. A considerable amount of funda- 
mental research is performed in industry in the United States. One manifes- 
tation of this is that industry authors were listed on approximately 10% of all 
scientific articles published in the U.S. in 2001 (National Science Board 2004, 
table 5-40). Many of these articles are coauthored with colleagues in academe. 

20 Referenced European Commission (2004, 34). 

Figure 7 Percent of U.S. PhDs Working in Industry, by Field, 1973-1999* 
[Bar chart. Vertical axis: percentage of PhDs. Fields: All S&E; Life Sciences; Engineering; Chemistry; Physics & Astronomy; Math & Computer Sciences. One bar per biennial survey year, 1973 to 1999.] 
* For those five or more years since receipt of PhD and 65 or younger. 
Source: SDR tabulations (see text). 

Employment in industry is a less salient option for European scientists. This 
is partly due to the lower rate of spending on R&D in Europe. For example, 
on average the EU spends approximately 2% of GDP on R&D; 55% of this is 
performed in industry. By way of contrast, the U.S. spends 2.9% of GDP on 
R&D; 64% is performed in industry. Japan spends 3.0% of GDP on R&D; 74% is 
performed in industry. Moreover, the prospects for employment growth in 
industrial R&D in the EU are not encouraging. The privatization of research 
labs of state industries is a case in point. Case 
studies of labs in Italy and France that have recently been privatized suggest 
that privatization has shifted the research focus of these labs away from the 
generation of new knowledge in the national interest to creating value for the 
company and its clients “by emphasizing the assessment and integration of 
external knowledge” (Munari 2002). Outsourcing of research is also an issue 
but the outsourcing is not solely directed towards Asia and countries that have 
a “cost advantage”. Table 1 presents data on R&D expenditures of European 
majority-owned affiliates operating in the United States (Bureau of Economic 
Analysis data). We see that over a short span of five years the amount spent 
by European affiliates (current dollars) has grown by more than 67 percent and over the 
10-year period by roughly 150 percent. A good example of the trend is the recent 
decision of Novartis to relocate its research headquarters to Cambridge, 
Massachusetts, in order to take advantage of the research synergies in the 
vicinity of MIT and Harvard universities. When it opens, Novartis will employ 
400 research scientists; its plans call for it to hire an additional 1000 
researchers in the next five years. 

Table 1 R&D Expenditures of Majority Owned European Affiliates in United 
States (Billions U.S. dollars) 

           1992    1997    2002 
Germany     1.8     2.9     5.7 
U.K.        2.1     3.0     5.5 
Other       4.4     6.4     9.5 
Total       8.3    12.3    20.7 
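The growth figures quoted for Table 1 can be checked against the table itself. The following Python sketch, offered only as an arithmetic check, sums each column and computes the five-year and ten-year growth of the total; the dictionary layout and the function names `column_total` and `growth_pct` are illustrative choices, not part of the source data.

```python
# Table 1 values (billions of current U.S. dollars), as printed in the text.
rd_spending = {
    "Germany": {1992: 1.8, 1997: 2.9, 2002: 5.7},
    "U.K.": {1992: 2.1, 1997: 3.0, 2002: 5.5},
    "Other": {1992: 4.4, 1997: 6.4, 2002: 9.5},
}
reported_total = {1992: 8.3, 1997: 12.3, 2002: 20.7}


def column_total(year):
    """Sum the country rows for one year, rounded to the table's precision."""
    return round(sum(row[year] for row in rd_spending.values()), 1)


def growth_pct(start, end):
    """Percentage growth of the reported total between two years."""
    return 100.0 * (reported_total[end] - reported_total[start]) / reported_total[start]


for year, total in reported_total.items():
    print(year, "rows sum to", column_total(year), "| reported total", total)

print("1997-2002 growth (%):", round(growth_pct(1997, 2002), 1))
print("1992-2002 growth (%):", round(growth_pct(1992, 2002), 1))
```

The rows sum to the reported totals in each year, and the totals grow by about 68% over five years and about 149% over ten, which matches the "more than 67 percent" and "150 percent" cited in the text.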


5 Conclusion 


Young scientists today have difficulty getting the research positions they 
anticipated at the time they began their training. Many end up holding 
postdoctoral positions for long periods or working as staff scientists, contract 
researchers or adjunct faculty. When they do get a permanent position, they 
start out at a considerably older age than did their mentors. 

There is much angst in western countries today concerning the prospects for 
economic growth. The role of scientific productivity in economic growth is 
widely appreciated. From time to time this angst focuses on problems of the 
supply of scientists, with the argument that economic growth will be jeopar- 
dized if supply fails to keep pace with projected demand. Here we have 
argued that the problem is not a lack of supply. Instead it is weakness in 
demand. Decreasing budgets and increasing relative costs have led the public 
sector to hire fewer scientists — especially into permanent positions. Industry, 
especially in Europe, has been slow to hire scientists and engineers. The future
of science depends on its ability to attract new generations of scientists and to employ
them in a research environment that fosters creativity. Unless fundamental 
problems giving rise to these employment issues are addressed, we risk the 
possibility of seriously diminishing scientific productivity in the West. 

This risk is occurring in the context of growing competition in an increasingly
global economy. Non-western nations are aggressively training and hiring
scientists and engineers. The number of PhDs awarded in China, for
example, increased more than five-fold between 1995 and 2005 (French 2005);
that in India and Korea has also grown dramatically. The ability of a country
to innovate and grow relates in part to having a scientific workforce that is 
generating new ideas. Both Europe and the U.S. are educating large numbers 
of PhDs. Some of these are “native.” Others come as foreign students. Unless 
Europe and the U.S. provide work environments in which these scientists and 
engineers can flourish and be productive, they risk losing the scientific edge
from which they have historically profited. The public sector needs to examine 
ways to enhance the hiring of scientists and engineers into positions that 
provide a productive work environment. Temporary, piecemeal jobs, which
have become increasingly the norm in many countries, are not the solution. 
Research requires a sufficient time horizon and a degree of autonomy. 
Countries seeking to enhance productivity need to provide such opportunities 
for scientists when they are young. Age may not be a fever chill, but prize- 
winning work is rarely begun when scientists are past the age of 40. 
Comment by


BERND FITZENBERGER* 


Stephan analyzes the interaction between job market prospects and scientific
productivity in the sciences.¹ She argues that dismal job prospects (will)
considerably reduce the entry of highly talented young researchers into an
academic career. Stephan focusses on the US, and she discusses some
developments in Italy and Germany.

Being a labor economist, I think this is a very interesting and needed study
because it discusses the important relationship between scientific progress and
the individual job prospects of researchers. Scientific progress cannot be
produced without suitable incentives (career prospects) for the researchers.
This is particularly critical for basic research (Nobel Prize-winning research
is only the tip of the iceberg), which typically does not involve immediate
commercial returns.


1 Critical Assessment of Analysis for US 


Stephan argues that job prospects for PhDs in the sciences have deteriorated
tremendously over the recent decade. The implicit contract between PhDs and
full professors/research universities, which rewards a successful,
hard-working PhD/assistant professor with eventual tenure in an academic
(university) job, has not paid off for an increasing share of the PhDs. The
increasing supply of completed PhDs in the US has in fact resulted in
universities hiring more of the cheaper postdocs and fewer of the more
expensive assistant professors on tenure-track positions. These changes
threaten the viability of the implicit contract and, in response, Stephan
predicts a severe decline in the willingness to do a demanding PhD in the future.

Clearly, at face value, this argument relies on irrational behavior of the
recent cohorts of PhDs because their expectations regarding the implicit
contract have not been realized on average. Stephan argues that such
irrational beliefs could have been reinforced by a culture of gift exchange
where postdocs could be lured into believing for a while that they would
eventually get a tenure-track position. Only with delay would these postdocs


* I am grateful for helpful comments by Dominique Demougin, Martin Kolmar, and 
other conference participants. I thank Marie Waller for excellent research assistance. All 
errors are my sole responsibility. 

1 Here, I refer to "the sciences" where Stephan refers to the physical, life,
and mathematical sciences, including engineering.
realize that this expectation will not materialize. As soon as students in the
sciences fully realize that these promises are broken, the supply of PhDs will 
decline considerably. 

I am inclined to investigate potential explanations not relying on irrational 
behavior. For a rational explanation, the difficulty is to explain the strong 
increase in supply of new PhDs despite the deterioration of the tenure 
prospects. I will investigate the following two arguments: (1) Increasing supply 
of foreigners obtaining PhDs in the US. (2) Good job prospects in industry for 
PhDs. 

Both arguments are also discussed by Stephan, though from a different
perspective. The first argument is based on the presumption that foreign
students still find graduate education in the US very attractive, even if chances
for tenure-track positions have deteriorated. Foreign students often prefer
staying in the US after completion of their PhD because of better job pros- 
pects in the US outside of academia compared to job prospects in their home 
countries. Postdoc positions are a simple way to extend the stay in the US and 
to find an attractive job. The huge supply of foreign graduate students is likely 
to fuel basic research in the US by filling the labs with highly educated and 
motivated postdocs, unless of course incentives to engage in basic research 
change themselves as Stephan indicates. 

Turning to the second argument, even in the early 1990s almost 90% of
biomedical PhDs could not get a tenure-track job (Figure 1 in Stephan’s
paper). Thus, the majority of PhDs must eventually end up in industry jobs,
which are likely to be quite attractive because these jobs often combine
applied academic research with high salaries.² This is confirmed by the
discussion in section 4 of Stephan’s paper. It seems plausible that a large
number of biomedical PhDs in the US see only a small chance of ending up in a
tenure-track position. Instead, they view obtaining a PhD and working in a
low-paid postdoc position mainly as an investment in their eventual career in
industry.

In the face of an increasing supply of PhDs, it is a rational response of
universities to change hiring policies such that young researchers obtain more
temporary positions with lower salaries. These changes increase uncertainty
among young researchers, which, Stephan argues, lowers research productivity.
This trend is associated with a shift away from basic to more applied
(commercialized) research.

A major problem arises nevertheless for the US, as Stephan emphasizes, if
excellence and creativity (prize-winning research) in basic research require


2 It is straightforward to develop an economic model of the decision to obtain a PhD
where the degree opens two career alternatives: first, PhDs are eligible to apply for a
tenure-track position in academia; second, they might obtain a well-paid research job in
industry. Ceteris paribus, an increasing number of PhDs can be explained rationally by
the second alternative becoming more attractive, even if the first alternative loses
option value.
independence of the researcher at an age below 40. In addition, increasing
competition for tenure-track positions might have an ambiguous effect on the
research effort of young researchers. On the one hand, one might speculate that
a more competitive environment increases incentives to do excellent
research in order to get one of the rare tenure-track positions. In this way,
total research output increases when competition for tenure-track positions
increases. On the other hand, the return to research effort declines with the
number of competitors, as standard tournament theory suggests, because
the chance of obtaining a tenure-track position (that is, the prize in the
tournament) declines at a given level of research effort. In light of the
declining returns to the tournament for the tenure-track positions, the share
of PhD students engaged in this tournament declines and more of them will
focus on applied research, enhancing their chances for a well-paid position in
industry.
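
The two opposing forces can be illustrated with a small numerical sketch. It does not implement the tournament model the text alludes to; as a simple stand-in it uses a Tullock-style contest in which the chance of winning the single prize equals one's share of total effort. The prize value, the unit cost of effort and all function names are assumptions made for illustration.

```python
def win_probability(own_effort, rival_effort, n):
    """Chance of winning the single prize against n - 1 identical rivals."""
    total = own_effort + (n - 1) * rival_effort
    return own_effort / total


def marginal_win_probability(effort, n):
    """Derivative of the win probability w.r.t. own effort
    when all n contestants exert the same `effort`."""
    return (n - 1) / (n ** 2 * effort)


def equilibrium_effort(prize, unit_cost, n):
    """Symmetric Nash equilibrium effort of the contest:
    prize * (n - 1) / (unit_cost * n**2)."""
    return prize * (n - 1) / (unit_cost * n ** 2)


prize, unit_cost = 100.0, 1.0  # illustrative values, not taken from the text
for n in (2, 5, 10, 50):
    effort = equilibrium_effort(prize, unit_cost, n)
    # columns: contestants, win probability, individual effort, total effort
    print(n, round(win_probability(effort, effort, n), 3),
          round(effort, 2), round(n * effort, 2))
```

In this particular specification individual effort falls as the number of competitors grows while total effort still rises. The sketch does not model researchers leaving the tournament for an industry career, which is the channel through which the text expects the share of participants to decline.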

Summing up, the increasing supply of PhDs in the sciences in the US by 
itself might not reflect irrational behavior but rather the immigration of 
excellent young researchers to the US and the good job prospects of PhDs in 
industry. It is not clear that these two effects are going to lose importance in 
the near future. Thus the only concern might be that young US citizens enroll 
to a lesser extent in PhD programs. However, the effect of the increasing 
supply of PhDs on total output in basic research (what is the research production
function?) is ambiguous. A related open question is whether the top
PhDs still strive for and obtain the tenure-track positions allowing them to do
basic research. Thus, I am not convinced that the amount of basic research will 
decline dramatically. 


2 The situation in Italy and Germany 


I think that the situation in Italy and in Germany is very different from
that in the US and that, therefore, Italy and Germany cannot be used as
further examples for the arguments put forward for the US. According to
Stephan, Italy has turned into a closed shop with basically no (!) hiring of
researchers into tenure-track positions in the sciences. Here, the job
prospects are clearly so bad that excellent Italian researchers tend to leave
the country (e.g. for the US).

The remainder of this section focusses on the situation in Germany. Stephan
first addresses the fact that the ratio of habilitations³ to the number of
professorships increased considerably between 1992 and 2004, from roughly
3/2 to 5/2 (referring to the numbers in the paper by Schulze and Warning in
this volume). However, one should be aware that this might be a cohort effect 
because a disproportionately large number of older professors are due to 


3 Traditionally, completing a habilitation was a formal requirement to be considered 
for a tenured professorship. 
retire between 2000 and 2010. Nevertheless, it is likely that prospects to obtain
a tenured position have deteriorated over the last 15 years. In the early 1990s, 
many professorships had to be refilled in East Germany. Nowadays, budgets 
are very tight and a number of professorships are cut or will be cut by the 
government. 

I have run a small exploratory survey about job prospects among six
young researchers in economics and sociology in Germany (for the sake of
brevity, I cannot report the detailed results here; it goes without saying that
six responses are not sufficient for statistically valid results).

The following answer in the survey: 

What I find worrisome is the current “overproduction” of young researchers 
due to the promotion of graduate programs and also of post-docs. Combined 
with a probable decrease in tenured positions and an increased net import of 
researchers, this may force many of my generation to drop out of academia ... 
confirms at first glance Stephan’s point that job prospects have deteriorated in
Germany in a similar way as in the US. There is, however, a major difference
between Germany and the US. In Germany, the average age at completion
of the habilitation is above 40. At this age, it is much more difficult to start an
alternative career in industry than it is for a postdoc in the US a couple of
years after completion of the PhD.

In principle, the introduction of the junior professorship, which does not
require a habilitation, as the equivalent of the assistant professorship should
lead to more independence of young researchers. The time limit imposed by the
German government should lead to earlier transitions to tenured positions. 
However, in contrast to the US system, junior professors typically do not have 
a tenure-track position. 

The change of the salary system from the C-system to the W-system 
involves a considerable decline of the base salary and flexible increases of the 
salary based on performance. However, upward salary flexibility is severely 
limited by tight budgets, in fact rendering the new pay system less attractive, 
especially for those who start their academic career under the new system. 

In the short run, again as a cohort effect, the introduction of the W-system 
might improve job prospects of young researchers as indicated by the fol- 
lowing answer in my survey: 

Due to the changes in the salary system, the competition from tenured pro- 
fessors from inside Germany is reduced. 

This is because established professors find it less attractive to change jobs 
under the new W-system. 

Overall, as Stephan concludes, an academic career in Germany is likely to
become less attractive because of the decline in salaries. The positive incentive
effects of the new W-system can only work if universities have sufficient
resources to honor performance and if the junior professorship becomes a true
tenure-track position.


Tertiary Education in a Federal System: 
The Case of Germany* 


by 
GÜNTHER G. SCHULZE 


1 Introduction 


In Germany, the responsibility for education rests with the German states, the 
Länder. While the primary and secondary education is largely a regional issue 
— pupils and teachers do not move across state boundaries in order to exploit 
differences in educational quality — this is clearly not so for tertiary education. 
For many fields students are free to move to universities outside the state in 
which they received their high school diploma. Likewise, Ph.D. students seek
jobs at universities that best meet their intellectual interests and are conducive
to furthering their career. High mobility of high-school graduates, students,
university graduates, and doctoral students across states in the presence of 
decentralized and almost free service provision raises important policy issues. 
A decentralized education system may well lead to externalities and ineffi- 
ciencies resulting in suboptimal educational investments by the states, which 
may either be too low or too high. 

Students may receive a free education in one state and subsequently move
to a different state where they find employment, pay taxes and increase the
local GDP, thereby free-riding on the educational services of the educating
state. If this were a random phenomenon, these in-kind transfers between states
would cancel out. However, if students react systematically to differences in
educational capacities (which may translate into different quality levels) in
their choice of university, there may be an incentive for states to free-ride on
the educational services provided by other states. This incentive could lead to
an underprovision of public education, provided that the decision where to
locate after graduating from university is independent of the decision
where to study. This assumption, however, is a stark one. There is evidence
that universities produce regional spillovers which create employment in the 
region and raise regional human capital and GDP (Stephan 1996). 


* I am very grateful to Susanne Warning for her support and to conference
participants and especially my discussant Stefan Voigt for helpful comments. Moreover I am
very much indebted to Hanna Rotarius and Friederike Hülsmann for excellent research
assistance and to Max Albert, Juliane Fliedner, Heinrich Ursprung, and members of the
council of the economics of education (Bildungsökonomischer Ausschuß) of the German
Economic Association (Verein für Socialpolitik) for helpful comments on an earlier
draft.
If these regional spillovers provide an incentive for students to stay in the
region after completion of their studies, states may have an incentive to attract 
the most brilliant minds in order to enhance their pool of high-skilled workers. 
States then find themselves in a different situation of strategic interaction
that may result in an overinvestment in educational capacities, as educational
quality is the instrument to attract out-of-state high school graduates. In that
case the attracting states free-ride on the primary and secondary education
provided by the home states of migrating students.¹

The question thus arises to what extent a federal system of free
tertiary education such as the German one gives rise to external effects and
strategic behavior, either through competitive overinvestment or through
free-riding behavior. This is the concern of this paper.

I model state governments’ decision to provide tertiary education in a
simple model of endogenous human capital formation. I consider three factors
of production. Labor is interregionally immobile and inelastically supplied.
Human capital is produced through the education system and is thus a
consequence of a political decision. It is mobile across state boundaries but not
internationally. This dichotomy of mobility reflects in a simple way the
well-established notion that mobility increases with educational attainment (e.g.,
Greenwood 1997, Chiswick 2000, Hunt 2000). Mobility occurs at two stages:
after high school individuals decide where to study, and after university
graduation they decide where to work. At both stages there is some inertia:
students have a preference to study near home but base their decision also on
relative educational qualities, and university graduates’ probability of staying
in the state where they graduated is larger than the share of employment that
this state provides. Lastly, capital is mobile internationally and thus the
domestically installed capital stock depends on the amount of labor and
human capital in the state. This reflects the observation that each German
state is a small open economy with an endogenous capital stock and that the
availability of labor and human capital is an important location factor for
investment (see Burgess and Venables 2004 for a survey). Thus the decision on
human capital formation impinges upon interregional capital allocation as
well.

In the empirical section of the paper I look for indications of externalities. 
In particular, I seek to establish whether there are significant differences 
across states in the level of educational quality as measured by the number of 
professors per 100 high school graduates and analyze to what extent these 
differences provide incentives to students to move to a state with better 
educational capacities. 


1 Likewise, states may seek to free-ride on the educational services within the
university system: they could hire new professors and new PhDs that had been trained
elsewhere on a net basis, thus saving on education expenses.
The analysis of education provision in the presence of mobility of high-skilled
labor dates back to Grubel and Scott (1966) and Bhagwati and
Hamada (1974) who analyzed the brain drain from developing to developed 
countries. Justman and Thisse (1997) show that governments will under- 
provide education (financed by some immobile factor) if high skilled labor is 
mobile. Wildasin (2000) shows that the immobile factors have to bear the costs 
of public education if educated labor becomes mobile, implying a regressive 
tax system. Südekum (2005) shows in a core-periphery model that educational
subsidies in the periphery can miss their regional policy target if increased 
education leads to higher mobility and therefore stronger migration to the 
center. Poutvaara and Kanniainen (2000) demonstrate that in the presence of 
educational spillovers and complementarities between low and high skilled 
labor, low skilled labor voluntarily subsidizes education. This result, however,
breaks down if high skilled labor becomes mobile and moves across state
boundaries. It evades the high taxes that finance education by moving abroad;
similarly, immobile uneducated labor seeks to free-ride on the educational
efforts of other states and thus avoid taxes, so that the public education system
breaks down. Poutvaara (2000) analyzes a combination of educational subsidies in
the first period with taxation to finance them in the second period and shows
that this scheme serves as an insurance device against uncertainty in educational
productivity. Interregional mobility may insure against region-specific shocks
and thus increase education, whereas tax competition leads to an erosion of
taxes; welfare effects can go in either direction. All of these papers allow for
mobility only after education has been completed; in most papers mobility is 
assumed to be perfect with the only motivation for mobility being differences 
in net returns. In contrast, I do not assume mobility to be perfect in the above 
sense, but governed by other considerations as well and I allow mobility of 
students as well as of high school graduates. Büttner and Schwager (2004)
model mobility of high school graduates in Germany’s federal system and 
assume local governments care only about the well-being of their high school 
graduates (wherever they study or work), some research spillover from uni- 
versities and the costs of their universities. They have thus an incentive to free 
ride on the education quality provided by neighboring states as their high 
school graduates can study there. As a result, investment in universities is 
suboptimally low. Büttner and Schwager disregard the effect of universities on 
the regional economy (Stephan 1996) and they assume high skilled labor to be 
perfectly mobile; thus by assumption a state cannot profit from attracting 
students. By contrast, I model these effects in a three factor stationary state
model and show that attracting students increases the remuneration of the 
immobile factor, attracts capital and raises regional GDP. 

The paper is structured as follows: section 2 presents theoretical
considerations on the provision of educational services in a federal system.
Section 3 provides empirical evidence on the provision of professors in Germany,
at the federal level as well as at the state level, and demonstrates that
differences in educational capacities influence students’ migration flows across
state boundaries. Section 4 provides some concluding remarks.


2 Theoretical Considerations 


In a federal system the provision of education may give rise to external effects 
at various levels as states may import educational services from other member 
states on a net basis. If states attract high school graduates from other states 
(for instance by providing a better university infrastructure and quality) and 
the attracted people remain in their new states after graduating from uni- 
versity these states effectively free-ride on primary and secondary education 
provided by other states. If, however, university graduates locate in a different
state after completing their education, the receiving state free-rides on tertiary
education provided by another state (and possibly also on primary and
secondary educational services). In other words, interstate mobility of students
and graduates produces externalities if education is subsidized or even free and
thus gives rise to potentially severe inefficiencies.

This situation is one of strategic interaction among states seeking to
attract human capital, at the expense of other states, without fully paying for
its production. They do so by providing a high quality university system that
promises a good education with high returns. I model the states’ calculus to
provide tertiary education in the presence of (limited) mobility of students in
a game-theoretic model of two jurisdictions in the steady state, which takes
into account that capital is mobile internationally and that human capital will
also attract foreign direct investment. Thereby I am able to model
international repercussions of competitive human capital formation in a
federal system.

The model proceeds in three steps: in the next subsection optimal human
capital formation in a small open economy is derived; the second subsection is
devoted to modeling the strategic interaction of small open federal states in
providing tertiary education; and lastly the properties of the ensuing Nash
equilibrium are described.


2.1 Optimal Human Capital Formation in a Small Open Economy 


Assume a small open economy that produces with the help of physical and
human capital and labor, $K$, $H$, $L$. The neoclassical production function
exhibits constant returns to scale and is described by $Y^D = F(K, H, L)$, where
$Y^D$ denotes the gross domestic product. First partial derivatives are positive,
second partial derivatives are negative and cross derivatives are positive (e.g.,
$F_{KH} > 0$, $F_{KL} > 0$ etc.). Inada conditions are assumed to hold. Capital is
mobile internationally and therefore earns the given world market rate of
return $r^w$, which implicitly determines the amount of capital installed in the
economy ($K^*$), given its human capital and labor endowment:


(1)  $r^w = F_K(K^*(H, L), H, L)$


Implicit in eq. (1) is the notion that a broad human capital and labor base will
attract physical capital as it increases its marginal return: $K^*$ depends on $H$
and $L$. Labor is assumed to be immobile. For now I treat human capital as
immobile across boundaries, but will relax this assumption in the next
subsection. Gross national product is given by:


(2)  $Y = F(K^*, H, L) + r^w (K^S - K^*)$,


where $K^S$ denotes the physical capital owned by the society and $K^*$ denotes
the physical capital installed in the economy. Thus $(K^S - K^*)$ denotes the net
capital export.

Optimal human capital formation is analyzed in the steady state because
educational investment is long-term by nature. For simplicity I assume labor
to be stationary. The steady state condition for capital accumulation requires
that savings equal depreciation of the capital stock, i.e.,
$s\,[F(K^*, H, L) + r^w (K^S - K^*)] = \delta K^S$. This determines $K^S$:²


(3)  $K^S = \frac{s}{\delta - s r^w}\,\left[F(K^*, H, L) - r^w K^*\right]$


$K^S$ may be larger or smaller than $K^*$.³ Unlike physical capital, which is
accumulated through (private) saving and investment dynamics, human capital
is produced through the public education system. Private schools and
universities, although they exist, play a very minor role in Germany. Thus
human capital accumulation is the result of a political decision. From eq. (2)
the effect of increased human capital on steady state national income is
derived as:


(4)  $\frac{\partial Y}{\partial H} = \underbrace{\frac{\partial F}{\partial K^*}\frac{\partial K^*}{\partial H} + \frac{\partial F}{\partial H}}_{\partial GDP/\partial H} + r^w\left(\frac{\partial K^S}{\partial H} - \frac{\partial K^*}{\partial H}\right) = \frac{\partial F}{\partial H} + r^w\,\frac{\partial K^S}{\partial H}$,


2 Note that $\delta > s r^w$ needs to hold; otherwise a non-degenerate steady state does not
exist. Realistic parameter constellations always satisfy this condition.

3 Whether the economy exports or imports capital depends on its saving rate relative 
to the world saving rate, its production technology and depreciation rate relative to the 
world. This is elaborated in detail in appendix 5.1. 
where I have made use of the fact that $\partial F/\partial K^* = r^w$. Differentiating (3) w.r.t.
$H$ and using the same relationship I obtain
$\frac{\partial K^S}{\partial H} = \frac{s}{\delta - s r^w}\,\frac{\partial F}{\partial H} > 0$. Thus (4)
simplifies to


(5)  $\frac{\partial Y}{\partial H} = \frac{\delta}{\delta - s r^w}\,\frac{\partial F}{\partial H} > 0.$
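
The step from eqs. (3) and (4) to eq. (5) can be written out as follows. This is a reconstruction from the definitions above and uses only $\partial F/\partial K^* = r^w$ from eq. (1); the symbols are those read from the surrounding equations.

```latex
\begin{align*}
\frac{\partial K^S}{\partial H}
  &= \frac{s}{\delta - s r^w}
     \left[\frac{\partial F}{\partial K^*}\frac{\partial K^*}{\partial H}
           + \frac{\partial F}{\partial H}
           - r^w \frac{\partial K^*}{\partial H}\right]
   = \frac{s}{\delta - s r^w}\,\frac{\partial F}{\partial H}, \\
\frac{\partial Y}{\partial H}
  &= \frac{\partial F}{\partial H} + r^w \frac{\partial K^S}{\partial H}
   = \left(1 + \frac{s r^w}{\delta - s r^w}\right)\frac{\partial F}{\partial H}
   = \frac{\delta}{\delta - s r^w}\,\frac{\partial F}{\partial H}.
\end{align*}
```

The multiplier $\delta/(\delta - s r^w)$ exceeds one whenever $\delta > s r^w$, which is the condition for a non-degenerate steady state.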


The accumulation of human capital has three effects: not only does it increase
GDP directly; it also attracts physical capital because it initially raises the
return to capital ($F_{KH} > 0$). This portrays the importance of a skilled labor
force as a location factor for foreign direct investment. Lastly, the increase in
human and physical capital raises the remuneration of the immobile factor,
labor. As a consequence GNP and the domestically owned capital stock rise;
the society has become more affluent.

The government, however, may not only care about pro-growth policies,
especially education. The incumbent may want to engage in redistribution, the
provision of public goods, which may not produce growth stimuli, or special
interest group policies (through transfers or subsidies) in order to maximize
political support.⁴ Spending resources on education or on the alternative uses
described above, denoted by $R$, is constrained by the size of the budget. As
I want to portray German federal states’ optimization calculus I take the tax
rate $t$ as given, because it is set by the federal government.⁵ Taking the
alternative uses for the budget as a composite commodity $R$ and using it as a
numéraire, the budget constraint reads as $t Y = p_H \tilde{H} + R$. $p_H$ denotes the
(relative) price for the production of new human capital $\tilde{H}$, which in the steady
state replaces exactly the depreciated human capital $\delta_H H$ (retiring skilled
personnel, obsolete technologies etc.), i.e., $\tilde{H} = \delta_H H$. The budget constraint
is endogenous as it depends on $H$.

Government’s optimization problem may now be formulated as 


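(6)  $\max V(Y, R)$ s.t. $t Y = p_H \tilde{H} + R$,


where $V$ is the objective function of the government. Optimality requires that


(7)  $\frac{\partial V/\partial Y}{\partial V/\partial R} = \frac{p_H \delta_H - t\,\partial Y/\partial H}{\partial Y/\partial H}$,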


4 For a survey of the political economy of redistribution and public goods provision
and special interest policy see Drazen (2000).

5 Income and corporate tax rates as well as VAT rates are set at the federal level; states 
receive a share of the income taxes roughly according to their share of GNP. In other 
federal systems states have the authority to tax independently of, or in addition to the 
federal level, such as Switzerland and the US. In these systems tax rates are additional 
policy parameters for the state governments. 
where $\partial Y/\partial H$ is given by eq. (5). The LHS of (7) gives the marginal rate of
substitution between $Y$ and $R$, which needs to be equal to the marginal rate of
transformation, the RHS of eq. (7). A marginal increase of steady state
income through an increased human capital stock reduces $R$ by the marginally
increased expenses for new human capital necessary to balance human capital
depreciation ($p_H \delta_H$), minus the increase in the budget due to the growth stimulus
of enhanced human capital investment ($t\,\partial Y/\partial H$). Condition (7) determines
implicitly the optimal value of human capital, $H^*$. Explicit solutions can be
derived by specifying functional forms for production and utility functions.
For instance, if I specify the utility function as


(8)  $V(Y, R) = (1 - \theta)\,Y + \theta R$ with $0 < \theta < 1$,


eq. (7) may be rewritten by using (5) as 


$\frac{\partial F}{\partial H} = \frac{\delta - s r^w}{\delta}\cdot\frac{\theta\,\delta_H\,p_H}{1 - \theta(1 - t)}$.
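
The intermediate steps behind this rewriting of eq. (7) can be spelled out as below. The derivation is a reconstruction that assumes the linear objective (8), the budget constraint with $\tilde{H} = \delta_H H$, and eq. (5); the symbols $t$, $\delta_H$ and $p_H$ follow the reading of the surrounding text.

```latex
\begin{align*}
\frac{1-\theta}{\theta}
  &= \frac{p_H \delta_H - t\,\partial Y/\partial H}{\partial Y/\partial H}
  && \text{(7) with } \partial V/\partial Y = 1-\theta,\ \partial V/\partial R = \theta \\
\frac{\partial Y}{\partial H}
  &= \frac{\theta\,\delta_H\,p_H}{1-\theta(1-t)}
  && \text{solving for } \partial Y/\partial H \\
\frac{\partial F}{\partial H}
  &= \frac{\delta - s r^w}{\delta}\cdot\frac{\theta\,\delta_H\,p_H}{1-\theta(1-t)}
  && \text{substituting eq. (5)}
\end{align*}
```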


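Furthermore, if I specify the production function as


(9)  $F(K, H, L) = K^{\alpha} H^{\beta} L^{1-\alpha-\beta}$,


the optimal human capital stock is given by


(10)  $H^* = \left[\frac{\delta}{\delta - s r^w}\cdot\frac{1 - \theta(1 - t)}{\theta\,\delta_H\,p_H}\;\beta\,K^*(H)^{\alpha}\,L^{1-\alpha-\beta}\right]^{\frac{1}{1-\beta}}$.


Yet, $K^*$ is a function of $H$. If I use the functional form in eq. (9) I can explicitly
derive the optimal capital stock, given $H$, from eq. (1):⁶


(11)  $K^* = \left[\frac{\alpha\,H^{\beta}\,L^{1-\alpha-\beta}}{r^w}\right]^{\frac{1}{1-\alpha}}$.


Plugging (11) into (10) and rearranging yields


(12)  $H^* = \left[\frac{\delta}{\delta - s r^w}\cdot\frac{1 - \theta(1 - t)}{\theta\,\delta_H\,p_H}\;\beta\left(\frac{\alpha}{r^w}\right)^{\frac{\alpha}{1-\alpha}}\right]^{\frac{1-\alpha}{1-\alpha-\beta}} L$.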


This result can be summarized in 


Proposition 1: In a small open economy with internationally mobile capital 
the optimal amount of human capital is proportional to the labor force; it is 
lower, the higher the political preference for alternative uses of the budget such 


6 One can easily show that $\partial K^*/\partial H > 0$ and $\partial^2 K^*/\partial H^2 < 0$.
as redistribution and special interest policies. It rises with the saving rate and
declines with the price of human capital formation and the price of physical 
capital. 


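If countries have the same production function and the same depreciation
rates of human capital and of physical capital,⁷ the ratio of optimal human
capital stocks in the steady state is given by:


(13) $\frac{H_1^*}{H_2^*} = \left[ \frac{(\delta - s_2 r)\,(1-\theta_1(1-\tau))\,\theta_2}{(\delta - s_1 r)\,(1-\theta_2(1-\tau))\,\theta_1} \right]^{\frac{1-\alpha}{1-\alpha-\beta}} \frac{L_1}{L_2}$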


This gives us 


Lemma 1: For two small open economies 1 and 2, not linked by human
capital or labor mobility, the ratio of optimal human capitals, $H_1^*/H_2^*$, rises with
the ratio of savings rates, $s_1/s_2$, and the relative population $L_1/L_2$, and declines
with the relative preference for alternative uses of public funds, $\theta_1/\theta_2$. For equal
savings rates and preferences the per capita human capital stock is equal across
states, which implies that for identical linear homogeneous educational
technologies the number of professors per capita is equal across states.


Lemma 1 has been derived for small open economies assuming that there is 
no strategic interaction in the education market. Yet, in a federal system 
human capital is formed not by a unitary state, but by many member states 
which compete for high skilled labor. This situation of strategic interaction is
produced by different degrees of factor mobility: labor is interregionally
immobile and thus its remuneration is determined by the employment of the
other factors; capital is internationally mobile, making each member state of
the union a small open economy and a price taker in the international capital
market. Human capital is assumed to be interregionally mobile, but
internationally immobile; its production is determined by interdependent political
decisions of a few state governments to provide university capacities. States
may seek to attract human capital formed by other states, thereby free-riding
on human capital investments of other member states. Thus states’
optimization calculus needs to take into account the mobility of individuals with
high skills and those that seek a higher education.


7 These are very reasonable assumptions for integrated markets as technology transfer 
should ensure the same — optimal — technology in both countries. Integrated factor 
markets for professors and teachers should ensure equal production costs for human 
capital. Indeed, in Germany professors’ wages have been set on the federal level by law 
(‘Hochschulrahmengesetz’) and through agreements of ministers of science and culture 
(‘Kultusministerkonferenz’). 
2.2 Competition for Human Capital in a Federal System


2.2.1 Students’ location choice 

High school pupils reside where their parents locate; this decision is deter- 
mined by job market and other considerations of the parents but not by dif- 
ferences in educational quality across states. Interstate mobility of people for 
educational purposes occurs only after they have graduated from high school. 
High school graduates may take up their university studies in their home state 
or in a different state, depending on the relative quality of the education 
system, and they may seek work in the state of their university education, or in 
some other state. Interstate mobility leads to externalities in the provision of
education by states and to strategic interaction in the education sector.⁸

In order to understand the nature of the strategic interaction on the edu- 
cational market I need to analyze the decision of students where to study as 
well as the decision of university graduates where to seek employment. These 
decision parameters will be taken into account in optimization calculus of 
state governments which produce human capital and compete for it with other 
states. 

Since I focus on tertiary education I assume that the number of high school
graduates (‘Abiturienten’) is given for each state, i.e., $HG_i = \overline{HG}_i \;\forall i$. Without
loss of generality I assume that each high school graduate wants to study and
that there is no capacity constraint on the federal level.⁹ I confine my analysis
to two states.

High school graduates of state i may either study in their home state or in 
the other state. Students have a bias for studying in their home state as this 
may reduce costs — they might still live with their parents — and preserves their 
established social contexts.¹⁰ Yet they base their decision also on the relative
educational capacities of both states, which I proxy by the relative number of 


8 If they study and work in a state different from the state in which they received their 
primary and secondary education, the state they work in free rides on the educational 
services provided by the state in which they went to school. If they return to their home 
state after graduating from school the home state free rides on the tertiary education 
provided by the state in which they went to university. 

9 In reality, not all high school graduates study at the university or technical college
(‘Fachhochschule’). In 2000 only 78.3% of all high school graduates (‘Abiturienten’ and 
‘Fachabiturienten’) had enrolled in a university, technical college and similar institutions; 
most of whom enrolled in the same or the following year after graduation (Statistisches 
Bundesamt, Fachserie 11, Reihe 4.3.1). The analysis could be easily adjusted for a 
transfer rate smaller than one. 

10 About two-thirds of prospective students prefer to study at the university that is
close to their parents’ home (Kultusministerkonferenz 2002). Therefore, there is a “home
bias” which induces students to study in the same state where they graduated from high
school.
professors.¹¹ The number of professors in a state is an indicator for the variety
of subjects offered and the specialities within the subjects and thus for the
probability of being able to study the most preferred subject. Moreover, given the less than
perfect mobility of students, the relative number of professors determines the
student-professor ratio, which in turn determines the quality of teaching and
the possibility to participate in research.¹² Thus the number $P_i$ of professors is
the strategic variable of state governments to attract students.
The number of new students in state i, $S_i$ (i = 1, 2), is given by


(14) $S_1 = a_1 HG_1 + (1-a_2)\,HG_2$ and $S_2 = a_2 HG_2 + (1-a_1)\,HG_1$,


where $a_i < 1$ denotes the share of high school graduates that study in their
home state. It is determined by the relative capacity of the state as well as the
home bias $b_i > 1$ of the students in that state:¹³


(15) $a_1 = b_1 \frac{P_1}{P_1+P_2}$ and $a_2 = b_2 \frac{P_2}{P_1+P_2}$.

If there were no home bias (i.e., $b_1 = b_2 = 1$), students would simply allocate
themselves according to relative capacity regardless of where they received their
high school diploma: $S_1 = \frac{P_1}{P_1+P_2}\,(HG_1 + HG_2)$. With a home bias they may
trade off better study conditions away from home against being close to home.
For that reason the number of high school graduates in a state also matters for
the number of students in that state; without home bias only the relative
number of professors would matter.
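
The allocation rule in eqs. (14) and (15) can be illustrated with a minimal Python sketch. The function name `new_students` and all numbers in the usage note are hypothetical and chosen only for illustration; the formulas follow the text, including the restriction from footnote 13 that home shares stay below one.

```python
def new_students(hg1, hg2, p1, p2, b1=1.0, b2=1.0):
    """New students (S1, S2) in two states, following eqs. (14) and (15).

    hg1, hg2: high school graduates; p1, p2: professors; b1, b2: home bias.
    """
    a1 = b1 * p1 / (p1 + p2)  # share of state-1 graduates who study at home
    a2 = b2 * p2 / (p1 + p2)  # share of state-2 graduates who study at home
    if a1 >= 1 or a2 >= 1:
        # Footnote 13: the home bias must be small enough to keep a_i < 1.
        raise ValueError("home bias too large: home share must stay below one")
    s1 = a1 * hg1 + (1 - a2) * hg2
    s2 = a2 * hg2 + (1 - a1) * hg1
    return s1, s2
```

With 1,000 and 3,000 high school graduates, equal capacities (200 professors each) and no home bias, each state receives 2,000 students. With 100 and 300 professors and a home bias of 1.2 in both states, the split is roughly 600 and 3,400; the total number of students is unchanged.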


For simplicity, and without loss of generality, I assume that every student
graduates from university. Upon completion of their studies a share $\rho$ of
students decides to seek a job in the state i where they graduated. The
remainder of the graduates have no regional preference and seek jobs in state
i according to the relative prospects of finding a job, which is proxied by the
share of state i in the federal GDP: $y_i = \frac{Y_i}{Y_1 + Y_2}$.


11 Of course, temporary staff including post docs and temporary researchers, equip- 
ment, buildings and student housing are important as well for students’ location decision. 
I assume differences in these factors follow differences in the professor-student ratio 
which I consider the most important factor determining the quality of teaching and 
research. 

12 If students were perfectly mobile in the sense that differences in the student-pro- 
fessor ratios were the only argument for moving, these ratios would be equal in equili- 
brium. In section 3.4 I provide evidence on students’ mobility being influenced by dif- 
ferences in university capacity and thus in quality. For evidence on student home bias and 
quality as determinants of migration decision see also Büttner et al. (2003). 


13 Since $a_i < 1$, we assume that $b_i$ is small enough not to violate the restriction that
$b_i \frac{P_i}{P_1+P_2} < 1$.
The number of new university graduates employed in state i, denoted by
$UG_i$, is given by


(16) $UG_i = \rho S_i + y_i (1-\rho)(S_1 + S_2)$, i = 1, 2.


This formulation portrays a situation where employment of university
graduates is not completely demand determined (according to the value of $y_i$), but
in which universities can have large spillovers for the regional economy in
creating high-skilled jobs, a prime example being Stanford University’s impact
on ‘Silicon Valley’.
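
Eq. (16) can be sketched in a few lines of Python. The function name `graduates_employed` and the sample inputs are hypothetical; the sketch only restates the formula, with the GDP shares computed from two state GDP levels.

```python
def graduates_employed(s1, s2, gdp1, gdp2, rho):
    """University graduates employed in each state, following eq. (16).

    rho is the share of graduates who stay in the state where they studied;
    the remaining graduates are allocated according to GDP shares.
    """
    y1 = gdp1 / (gdp1 + gdp2)
    y2 = gdp2 / (gdp1 + gdp2)
    footloose = (1 - rho) * (s1 + s2)  # graduates without regional preference
    ug1 = rho * s1 + y1 * footloose
    ug2 = rho * s2 + y2 * footloose
    return ug1, ug2
```

For example, with 600 and 3,400 students, GDP levels of 1 and 3, and a staying share of 0.5, the two states employ 800 and 3,200 graduates. The total always equals the total number of students, which is why a gain for one state is a loss for the other.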


2.2.2 State governments’ optimization 

Governments maximize their utility (eq. 8) subject to the budget constraint 
which is endogenous to the decision how much of the funds to allocate for 
consumption and redistributive purposes and how much to invest into human 
capital formation. Since I focus on tertiary education the relevant policy
instrument is the number of professors that a state employs in the steady state,
$P_i$.¹⁵ As the market for professors is integrated and the salary schemes are the
same for all states, the annual price for a professor (including equipment,
support staff, and researchers in the research unit), $p_P$, is the same for all
states. State governments’ optimization problem can be re-stated as


(17) $\max V_i(Y_i(H_i), R_i)$ s.t. $\tau\,Y_i(H_i) = p_P P_i + R_i$.


The gross national product of a state i, $Y_i$, depends on human capital
production as shown in section 2.1, which in turn depends on how many university
graduates a state is able to attract (eq. 16). In the steady state $UG_i = \delta_H H_i$.
The number of university graduates, however, depends on how many students a
state educated (eqs. (14) and (16)), which is a function of the relative number
of professors that a state employs (eq. 15). The utility function can be
rewritten as


$V_i(Y_i, R_i) = (1-\theta_i)\,Y_i + \theta_i R_i = (1-\theta_i)\,Y_i + \theta_i(\tau Y_i - p_P P_i)$
$= [1-\theta_i(1-\tau)]\,Y_i - \theta_i\,p_P P_i$,


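where I have used the budget constraint. As the policy variable is the absolute
number of professors, the first order condition reads as


(18) $\frac{\partial V_i}{\partial P_i} = [1-\theta_i(1-\tau)]\,\frac{\partial Y_i}{\partial H_i}\,\frac{\partial H_i}{\partial UG_i}\,\frac{\partial UG_i}{\partial P_i} - \theta_i\,p_P = 0$.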
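
As a quick numerical check of the step from the budget constraint to the reduced utility function and the first order condition, the following Python sketch evaluates both forms. The function names and the sample values are hypothetical and serve only as illustration.

```python
def utility_direct(y, p, theta, tau, price_prof):
    """(1 - theta) * Y + theta * R, with R taken from the budget constraint."""
    r = tau * y - price_prof * p  # tau * Y = p_P * P + R, solved for R
    return (1 - theta) * y + theta * r


def utility_reduced(y, p, theta, tau, price_prof):
    """[1 - theta * (1 - tau)] * Y - theta * p_P * P."""
    return (1 - theta * (1 - tau)) * y - theta * price_prof * p


def marginal_utility(dy_dp, theta, tau, price_prof):
    """Left-hand side of the first order condition for a given total effect dY/dP."""
    return (1 - theta * (1 - tau)) * dy_dp - theta * price_prof
```

Both utility functions return the same value for any inputs, and `marginal_utility` is zero exactly when the income gain from one more professor, weighted by 1 - theta * (1 - tau), equals theta times the price of a professor.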


14 This assumption reflects the availability of high skilled labor as important location 
factor for mobile (high-tech) firms. 

15 This implies that the steady state replacement need is given by $\delta_P P_i$, where $\delta_P$
denotes the average replacement rate for professors. Since the average age of obtaining
the first tenured job is around 42 and mandatory retirement age is 65, this depreciation
rate is about 4.3 percent per annum.
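$\partial Y_i/\partial H_i$ is given by eq. (5), and $\partial H_i/\partial UG_i = 1/\delta_H$ in the steady state. The term
$\partial UG_i/\partial P_i$, which describes the strategic interaction, still needs to be
determined. Plugging eqs. (14) and (15) into (16) and differentiating w.r.t. $P_1$ yields


$\frac{\partial UG_1}{\partial P_1} = \rho\,\{b_1 HG_1 + b_2 HG_2\}\,\frac{P_2}{(P_1+P_2)^2} + (1-\rho)\,\frac{HG_1 + HG_2}{[Y_1+Y_2]^2} \left( Y_2 \frac{\partial Y_1}{\partial P_1} - Y_1 \frac{\partial Y_2}{\partial P_1} \right)$.


The terms in the last parenthesis contain the term $\partial UG_1/\partial P_1$.¹⁶ Solving (18)
for $\partial UG_1/\partial P_1$ gives:


(19) $\frac{\partial UG_1}{\partial P_1} = \frac{\rho\,\{b_1 HG_1 + b_2 HG_2\}\,\frac{P_2}{(P_1+P_2)^2}}{1 - \frac{(1-\rho)(HG_1+HG_2)}{\delta_H\,[Y_1+Y_2]^2} \left( Y_2 \frac{\partial Y_1}{\partial H_1} + Y_1 \frac{\partial Y_2}{\partial H_2} \right)}$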

2.3 Nash-Equilibrium 


Now I can derive the first order conditions. Differentiating the state’s utility 
function with respect to the professors that a state employs and setting this 
expression equal to zero yields the reaction function of that state. From eqs. 
(18) and (19) I obtain for state 1: 


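(20) $[1-\theta_1(1-\tau)]\,\frac{\partial Y_1}{\partial H_1}\,\frac{1}{\delta_H}\,\frac{\rho\,\{b_1 HG_1 + b_2 HG_2\}\,\frac{P_2}{(P_1+P_2)^2}}{1 - \frac{(1-\rho)(HG_1+HG_2)}{\delta_H\,[Y_1+Y_2]^2} \left( Y_2 \frac{\partial Y_1}{\partial H_1} + Y_1 \frac{\partial Y_2}{\partial H_2} \right)} = \theta_1\,p_P$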


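An analogous expression is obtained for state 2. Solving (20) for $P_1$ and
dividing the equation by the analogous equation for state 2 gives us the ratio
of professors in both states:


(21) $\frac{P_1}{P_2} = \frac{\theta_2\,[1-\theta_1(1-\tau)]}{\theta_1\,[1-\theta_2(1-\tau)]}\,\frac{\partial Y_1/\partial H_1}{\partial Y_2/\partial H_2} = \frac{\theta_2\,[1-\theta_1(1-\tau)]}{\theta_1\,[1-\theta_2(1-\tau)]}\,\frac{\delta - s_2 r}{\delta - s_1 r} \left( \frac{L_1/H_1}{L_2/H_2} \right)^{\frac{1-\alpha-\beta}{1-\alpha}}$,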


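16 The last term in parentheses can be restated as
$Y_2 \frac{\partial Y_1}{\partial P_1} - Y_1 \frac{\partial Y_2}{\partial P_1} = \left( Y_2 \frac{\partial Y_1}{\partial H_1} \frac{1}{\delta_H} + Y_1 \frac{\partial Y_2}{\partial H_2} \frac{1}{\delta_H} \right) \frac{\partial UG_1}{\partial P_1}$,
where I have made use of $\partial UG_2/\partial P_1 = -\partial UG_1/\partial P_1$.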
where for the second equality I have made use of eq. (5) and the fact that
the amount of mobile capital installed at home is given by (11). 
This gives us 


Proposition 2: Assume federal states with identical technology which produce 
with fixed labor endowments, internationally mobile physical capital (the price 
for which is given) and human capital which is produced by the member states 
and is mobile across member states. In the Nash equilibrium states will provide 
the more capacity for tertiary education relative to their competitors, measured 
by the relative number of professors, 

(I) the larger their political preference for pro-growth policies relative to 
consumptive and redistributive use of public funds is relative to that of 
other states; 

(II) the larger their saving rates are relative to those of the other member 
states; 

(III) the larger their relative population. 


While effects (I) and (II) are linear in relative preferences and savings rates,
effect (III) is sublinear; that is, ceteris paribus, larger states have a worse human
capital endowment per capita.


Proof: 

(I) follows directly from differentiating (21) with respect to ($\theta_1/\theta_2$). Other
things being equal, the smaller the preference for redistribution and
consumption of public funds (i.e., the smaller $\theta$), the larger the number
of professors relative to its neighbor. Algebraically,
$\partial[(1-\theta(1-\tau))/\theta]/\partial\theta = -\theta^{-2} < 0$.

(II) follows directly from differentiating (21) with respect to ($s_1/s_2$).

(III) follows from differentiating (21) with respect to [$(L_1/H_1)/(L_2/H_2)$].
Higher relative labor to human capital endowment ratios lead to higher
relative numbers of professors and thus to higher numbers of university
graduates, other things being equal. However, this relationship is
sublinear, that is, a larger state (in terms of its population) will have a less
than proportionally larger number of professors. This is seen from the
exponent of the last term of eq. (21):


$0 < \frac{1-\alpha-\beta}{1-\alpha} < 1$.
The last finding contrasts starkly with the result of Lemma 1, which states that
human capital endowment is proportional to the size of the labor force. In 
other words, competition between member states for mobile human capital 
puts larger states at the receiving end. 
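
Parts (I) and (III) of the proof can be illustrated numerically. The Python sketch below rests on assumptions that are labeled here because they are not spelled out in this passage: a Cobb-Douglas technology with exponents called `alpha` (physical capital) and `beta` (human capital), and a capital stock that adjusts until its marginal product equals a given cost (`capital_cost`, a hypothetical parameter). Under these assumptions the marginal product of human capital depends only on the ratio L/H, with an elasticity between zero and one.

```python
import math


def preference_term(theta, tau):
    """(1 - theta * (1 - tau)) / theta, the preference factor used in part (I)."""
    return (1 - theta * (1 - tau)) / theta


def marginal_product_of_human_capital(h, l, alpha, beta, capital_cost):
    """dF/dH for F = K**alpha * H**beta * L**(1 - alpha - beta) at the optimal K."""
    # Assumption: capital adjusts until its marginal product equals capital_cost.
    k = (alpha * h ** beta * l ** (1 - alpha - beta) / capital_cost) ** (1 / (1 - alpha))
    return beta * k ** alpha * h ** (beta - 1) * l ** (1 - alpha - beta)


def labor_elasticity(h, l, alpha, beta, capital_cost, factor=2.0):
    """Log-change of dF/dH per log-change of L, holding H fixed."""
    base = marginal_product_of_human_capital(h, l, alpha, beta, capital_cost)
    scaled = marginal_product_of_human_capital(h, factor * l, alpha, beta, capital_cost)
    return math.log(scaled / base) / math.log(factor)
```

With `alpha = beta = 0.3` the elasticity equals (1 - alpha - beta) / (1 - alpha), about 0.57, and scaling H and L by the same factor leaves the marginal product unchanged. `preference_term` is decreasing in `theta`, in line with part (I).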
3 Empirical Analysis


This section first portrays overall trends in the provision of professors over 
time and points out differences between East and West Germany. It then 
studies the distribution of professors across the 16 German ‘Länder’ in order 
to see whether the observed pattern is consistent with the theoretical pre- 
dictions of the model. Lastly, it provides evidence for one of the model’s 
central assumptions that differences in the number of professors translate into 
migration flows of students across states. 

Tertiary education in Germany has essentially three tiers: (1) the
universities, (2) technical and other colleges (‘Fachhochschulen’),¹⁷ (3) vocational
and technical schools (‘Fachschulen’) and universities of cooperative
education (‘Berufsakademien’), and comparable institutions at each level. I focus on
professors at the highest level, who until very recently needed a habilitation in
order to become a professor.¹⁸ This kind of “Super-PhD” required writing a
book significantly more comprehensive than a PhD dissertation and passing
an oral exam; it typically took at least as long as a normal PhD.¹⁹ The custom
was that candidates could not get their first tenured position at the university
where they had received their habilitation. The second and third tier positions
do not require a habilitation and are much more applied in their academic
approach.

I thus look at professors of all fields that are employed full time at a
university, a technical university, or a pedagogic university.²⁰ I exclude professors
that typically required no habilitation (or equivalent scientific output) for
their appointment; in particular, those at technical and other colleges
(‘Fachhochschulen’) and colleges of art (‘Kunsthochschulen’). The professors
I look at are almost always tenured;²¹ I exclude assistant professors
(‘Juniorprofessoren’), which were introduced only recently and are overwhelmingly
non-tenured. All data sources are detailed in appendix 5.2.1.


17 They often refer to themselves as “universities of applied sciences”.

18 Habilitation was not an indispensable requirement in order to receive a
professorship; equivalent scientific achievements could substitute for the habilitation.
However, the vast majority of university professors held a habilitation. The only notable
exception is professors of engineering, many of whom do not have a habilitation.

19 Germany is now moving in the direction of the Anglo-Saxon system with
non-tenured assistant professors, while the possibility of writing a habilitation and being part of a
research team under the supervision of a tenured professor still coexists (and arguably
still is the dominant form of preparation for the tenure decision). Recently the
cumulative habilitation, a collection of papers, has become popular.

20 Pädagogische Hochschule, which existed in Thüringen and Sachsen-Anhalt until 
1992, in Schleswig-Holstein until 1993, and still exists only in Baden-Württemberg. My 
sample includes the Catholic University Eichstätt and the two universities of the armed 
forces (“Bundeswehrhochschulen”). 

21 These are professors of the salary brackets C2, C3, C4 and, after introduction of the
new classification in 2004/5, W2, W3. Very few of these positions are non-tenured.
3.1 Trends in the Number of Professors and Habilitations at the Federal Level


Overall, the number of professors shows a clear downward trend starting in 
1993, when Germany had 22,892 professors; in 2004 Germany had only 21,323 
professors. That is a reduction of seven percent in twelve years. This is shown 
in Figure 1. 


Figure 1 Professors in Germany, 1992 — 2004 
The development in the new ‘Länder’ runs somewhat counter to the overall
trend. Due to the reunification of Germany in 1990 and the ensuing layoff of
East German professors hired by the GDR and the restructuring of
universities in the East, new professors needed to be hired in order to re-create
existing universities. Thus the number of professors rose in the East and
leveled off only in 2000 (Figure 2). Yet the overall trend is only
mildly affected by that as the share of professors in the new ‘Länder’ is less
than 20 percent.²²

Even though the restructuring of the East German universities may have 
opened up a time window of exceptional opportunities for new professors 
between 1992 and 1995, the overall trend in professorships continued to 
deteriorate for those seeking a career in academia. This, however, is not 
reflected in the number of habilitations, which continued to rise from 1311 in 


22 In Figures 1 and 2 the graphs for West and East Germany both exclude Berlin as it
was reunified as a state as well and thus has two universities of the former West Berlin 
and one of the former East Berlin. The graph for Germany includes Berlin, of course. 
Figure 2 Professors in East Germany (without Berlin)
1992 to 2283 in 2004, or by 74 percent (cf. Figure 3)! The drop in habilitations
in the East in 1990 to 1993 is due to the fact that the “Promotion B”, the GDR 
equivalent to the habilitation is included in that figure. East German scientists 
sought to finish their “Promotion B” before or shortly after they or their 


Figure 3 Habilitations in Germany, 1990-2004 
professors were laid off. The low level of habilitations in the East between
1993 and 1997 is explained by the fact that a new generation of Ph.D. students 
needed to be channeled through the system before new habilitations were 
finished in larger numbers. 

Consequently, the number of habilitations per 100 professors rose very 
significantly during the period from under 6 in 1992 to over 10 in 2004 as 
shown in Figure 4. 


Figure 4 Habilitations per 100 Professors in Germany 
If I assume an average age for first appointment as tenured professor
(‘Erstberufung’) of 42 years²³ and standard retirement at the age of 65, the
average replacement need would be 4.3 people newly awarded habilitation
per 100 professors each year.²⁴ Thus the ratio of new applicants to job
openings rose from roughly 3/2 to 5/2.
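
The back-of-the-envelope calculation above can be reproduced with a short Python sketch. The function names are hypothetical, a uniform age structure is assumed (as the text itself notes is only an approximation), and the inputs 6 and 10 are round stand-ins for "under 6" and "over 10" habilitations per 100 professors.

```python
def replacement_need_per_100(first_appointment_age, retirement_age):
    """Annual openings per 100 professors under a uniform age structure."""
    return 100.0 / (retirement_age - first_appointment_age)


def applicants_per_opening(habilitations_per_100, need_per_100):
    """New habilitations per year relative to the annual replacement need."""
    return habilitations_per_100 / need_per_100
```

With appointment at 42 and retirement at 65, the need is 100/23, or about 4.3 per 100 professors. Dividing 6 and 10 habilitations by this need gives ratios of about 1.4 and 2.3, which is the rise from roughly 3/2 to 5/2 described in the text. A retirement age of 67 lowers the need to exactly 4 per 100.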


23 This number was given by the Deutsche Hochschulverband (the German
association of professors and those awarded habilitation), cf. Hartmer (2001). Berning et al.
(2001) find that the average age of habilitation in Bayern was 39.5 years in the period 
1993-98, although with large differences between fields. 

24 With the newly increased retirement age of 67 the replacement need is only 4
percent p.a. Obviously these are just illustrative back-of-the-envelope calculations, as
the age structure of professors is not uniform, in particular due to large expansions of
universities in the seventies and the reunification.
3.2 Differences in the Number of Professors across States


I seek to identify differences in the provision of university services across 
states, measured by the number of professors adjusted for the relevant state 
size. I use two variables for size — the number of residents and the number of 
high school graduates. The number of professors divided by number of 
residents does not account for different age and industry structures and socio- 
economic profiles of the population between states; the number of professors 
divided by number of high school graduates seems the more appropriate 
number as the number of high school graduates measures the demand for 
professors. It would be endogenous if the decision to seek a high school 
diploma was dependent on the university quality in that particular state; which 
seems unlikely. However, if there were significant differences between states 
in the share of high school graduates that wish to study, a normalization by 
high school graduates could bias results. As the number of high school graduates I
use the average of the last five years as I assume that most students need five
years to complete their studies.²⁶
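
The normalization described here can be sketched as follows. The function name, the sample figures, and the choice to include the current year in the five-year window are assumptions made for illustration; the text only states that the average of the last five years is used.

```python
def professors_per_100_graduates(professors, graduates_by_year, year, window=5):
    """Professors per 100 high school graduates, using a trailing average.

    graduates_by_year maps a year to the number of high school graduates.
    Assumption: the window covers `year` and the (window - 1) preceding years.
    """
    years = range(year - window + 1, year + 1)
    mean_graduates = sum(graduates_by_year[y] for y in years) / window
    return 100.0 * professors / mean_graduates
```

For a hypothetical state with 950 professors and an average of 10,000 high school graduates over the five-year window, the indicator is 9.5.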

Figure 5 presents the overall trend in the number of professors per 100 high
school graduates. It shows that the ratio has fallen significantly from
11.26 in 1996 to 9.43 in 2004 (or by 16%), indicating a substantial deterioration
of German university quality.

Hidden behind this overall figure for Germany is a wide disparity in this 
indicator. Figure 6 gives an overview of this pattern. 

Two stylized facts are evident. First, city states (Hamburg, Berlin, Bremen) 
have higher ratios of professors to high school graduates than the other states. 
Cities should be expected to have higher ratios: they draw students from the 
hinterland because universities tend to be more concentrated in cities or 
larger towns (and many towns do not have a university at all) and high school 
graduates are more evenly distributed. While there are usually no large 
external effects of this pattern as the hinterland mostly belongs to the same 
state as the city, this does not hold for city states, which draw students from 
neighboring states. Thus they provide external benefits to the surrounding 
states. 

Second, the new ‘Länder’ have smaller ratios than the old ‘Länder’. An
apparent combination of these two effects is found in the case of Brandenburg 
and Berlin where the former is free-riding on the latter’s universities. The 


25 I do not use the number of students in that state as a control for size as it is
endogenous to the capacity/quality of the universities because students may migrate from
states with relatively low numbers of professors adjusted for size to those with relatively
large numbers. Thus, the professor-student ratio underestimates the high performers
and overestimates the low performers.

26 Therefore, if all high school graduates in that state wanted to study and migration
were absent, the professor-student ratio would be five times lower. 
Figure 5 Professors per 100 high school graduates in Germany, 1996 - 2004
Figure 6 Professors per 100 high school graduates in the states
East-West divide may have financial reasons as eastern states are less affluent
and they might still be affected by the transition from a socialist command and
control society and economy to a democratic society and market economy.
Table 1 provides a more detailed picture for different time
periods as there is a strong time trend. It also shows that the normalization by
Table 1 The provision of professors across states in absolute and relative terms


                          Professors            Professors per        Professors per 100 high
                          (annual mean)         100,000 residents     school graduates
                                                (annual mean)         (annual mean)
Federal State             1992-1997  1998-2004  1992-1997  1998-2004  1992-1995  1996-1999  2000-2004

Baden-Württemberg           3011.17    2693.14      29.27      25.45      13.13      13.46      10.79
Bayern                      3092.00    3075.71      25.89      25.02      12.78      13.33      12.17
Berlin                      1857.83    1511.43      53.66      44.59      27.50      17.08      12.53
Brandenburg                  297.17     387.29      11.66      14.97          -       4.52       3.69
Bremen                       347.17     377.29      51.05      56.93      15.36      16.71      18.11
Hamburg                     1097.17     999.00      64.43      58.04      20.03      19.74      21.09
Hessen                      1985.50    1812.29      33.15      29.84      12.10      11.66      10.88
Mecklenburg-Vorpommern       442.83     501.57      24.21      28.50          -       8.26       8.06
Niedersachsen               1843.33    1708.86      23.86      19.79       9.48       9.64       9.21
Nordrhein-Westfalen         4785.33    4562.57      26.82      25.30       9.26       9.43       8.64
Rheinland-Pfalz              913.67     895.14      23.08      22.13       9.76      10.12       9.08
Saarland                     272.33     257.29      25.13      24.13      11.93      11.88      10.20
Sachsen                     1136.67    1232.43      24.85      28.08          -       8.50       7.97
Sachsen-Anhalt               471.67     585.57      17.17      22.68          -       6.32       6.45
Schleswig-Holstein           482.17     509.00      17.74      18.17       7.49       8.74       8.64
Thüringen                    532.67     610.86      21.22      25.35          -       7.18       6.54

Deutschland                22568.67   21719.43      27.65      25.96          -      10.68       9.61
West (without Berlin)      17829.83   16890.29      27.89      25.40      11.11      11.34      10.36
East (without Berlin)       2881.00    3317.71      20.28      24.17          -       7.08       6.55


Data sources: see appendix.


the number of residents produces a somewhat different picture. City states
still provide a much higher relative number of professors, and Brandenburg
still ranks last. Yet the ranking is different and there is no longer a clear
East-West difference. The ratio of high school graduates to population
obviously differs substantially across states.

Is it possible to shed some light on the reasons behind these stylized facts? I 
offer some very tentative evidence on this issue in the section below. 
3.3 What Determines the Provision of Professors? - Some Mickey Mouse
Econometrics


In this section I seek to determine whether there is any pattern in the pro- 
vision of professors per 100 high school graduates. I thus run an OLS 
regression with robust standard errors with the number of professors per 100 
high school graduates as endogenous variable for the sample of 16 German 
states in 2003. As explanatory variables I use a West dummy, budget per
capita and a political index. The West dummy is one if the state is an ‘old’
(Western) state, zero if it is a new state and 0.5 in the case of Berlin. The political
index is one if the prime minister is from a conservative or liberal party (CDU,
CSU, FDP, Schill party) and zero if he or she comes from a left/labor party
(SPD, Green Party, PDS).²⁷ As changes in the university system take effect
only gradually I use budget data and political index data for the last ten years
and discount the values for past years with 5 percent p.a. Lastly, I use as a
measure for the relative size of states the share of state GDP in federal GDP.
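
The construction of the explanatory variables can be sketched in Python. The text does not say how the discounted values are aggregated, so the normalized weighted mean with weights 1/(1 + rate)^age is an assumption, as are the function names; the coding of the West dummy follows the description above.

```python
NEW_STATES = {
    "Brandenburg",
    "Mecklenburg-Vorpommern",
    "Sachsen",
    "Sachsen-Anhalt",
    "Thüringen",
}


def west_dummy(state):
    """1 for an old (Western) state, 0 for a new state, 0.5 for Berlin."""
    if state == "Berlin":
        return 0.5
    return 0.0 if state in NEW_STATES else 1.0


def discounted_mean(values_latest_first, rate=0.05):
    """Weighted mean in which a value `age` years old gets weight 1 / (1 + rate)**age.

    Assumption: discounted values are normalized by the sum of the weights.
    """
    weights = [1.0 / (1.0 + rate) ** age for age in range(len(values_latest_first))]
    weighted = sum(w * v for w, v in zip(weights, values_latest_first))
    return weighted / sum(weights)
```

A state governed by a conservative or liberal prime minister in all ten years gets a discounted political index of one under this normalization; more recent years count more than earlier ones whenever the index changes over the period.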
The results are given below. 


Table 2 Cross-state OLS regression, endogenous variable: number of
professors per 100 high school graduates


                              (1)                 (2)                 (3)
Variable                      Coefficient         Coefficient         Coefficient
                              (t-statistics)      (t-statistics)      (t-statistics)

West-Dummy                    7.281299***         7.36203***          7.252577***
                              (5.41)              (5.80)              (6.30)

Political index               .2484231*           .2550074            .2459507*
                              (1.35)              (1.43)              (1.46)

State share of population     -3.076724
                              (-0.26)

Share of state GDP                                -.0052421
                                                  (-0.05)

Budget per capita             325.512***          318.4054***         326.8929***
                              (5.21)              (4.99)              (6.18)

Constant                      -4.886064           -4.593907*          -4.93211**
                              (-2.02)             (-1.80)             (-2.32)

No. obs.: 16                  F(4,11) = 13.96     F(4,11) = 14.06     F(3,12) = 20.31
                              Prob > F = 0.0003   Prob > F = 0.0003   Prob > F = 0.0001
                              R² = 0.836          R² = 0.837          R² = 0.835


***/**/* indicate significance at the 1/5/10 percent level; for the political index,
* marks significance at the 17 percent level only. Data sources and data description
are found in appendix 5.2.2.


27 Alternatively I used the party affiliation of the science or education minister, which 
affected results only mildly. Arguably the finance minister and the prime minister have 
more influence on the education policy than the education minister. 
The preliminary results show a strong East-West divide in the level of
professor provision and a strong positive effect of per capita budget, but not 
an effect of relative size as measured by state GDP as a share of federal GDP. 
The effect of budget per capita captures some of the city state effect as city 
states have larger budget per capita. Lastly, there is some weak indication that 
the level of provision of professors is influenced by the political stance of the 
incumbent with conservative politicians tending to spend more on professors 
in relative terms. The variable never reaches normal significance levels; yet 
this may be due to the low degrees of freedom. 

These results are only suggestive because of the very small number of 
observations, but they show a direction in which a more profound econometric 
analysis could go. There seems to be some supportive evidence for the
importance of different policy stances, as expressed in the variable $\theta$ in the
model, as well as the endogenous budget constraint.


3.4 Student Migration across States 


Next I seek to determine whether students do in fact react systematically to 
differences in university quality across states as measured by the number of 
professors per 100 high school graduates in that state. This was one of the 
guiding assumptions in the theoretical part of the paper; it makes the quality 
of the educational system a strategic variable in the competition for high 
skilled people. 

I employ a standard gravity equation that has been used widely in empirical 
analyses of international trade and factor movements (Deardorff 1995, 
Frankel et al. 1996 and many others). It relates positively the gross flows of 
freshmen from state i to state j (FRESHMEN_ij) to the relevant sizes of the two 
states, which are approximated for our problem by the numbers of high school 
graduates in both states (HS-GRAD). Gross flows are negatively influenced by 
the DISTANCE between states as measured by the distance in kilometers 
between state capitals or, in case a state had two dominant centers, by the 
population weighted distances between these centers and the corresponding 
state’s center.²⁸ As usual, I included a dummy for adjacent states (ADJACENT) 
because the variable DISTANCE may not capture the relevant distance 
appropriately in that case. Because there is a strong East-West difference in 
Germany, separate dummies EAST were included that took on the value 1 if the 
home state or the host state was an East German state, 0.5 in the case of 


28 For instance, for a migration from Saxony to the Saarland I used the distances 
between Leipzig and Saarbrücken and Dresden and Saarbrücken weighted by the 
relative population shares of Leipzig and Dresden. All details on the construction of the 
variables including their sources are in appendix 5.2.3. 
Table 3 Determinants of student migration flows between German states 

Dependent variable: ln(FRESHMEN_ij) 

                    (1)                           (2) 
Variable            Coefficient     (t-value)     Coefficient     (t-value) 

ln(HS-GRAD_i)         .8756688***   (13.81)         .9010989***   (8.97) 
ln(HS-GRAD_j)        1.000499***    (19.41)        -.0590754      (-.53) 
ADJACENT              .7550757***   (6.99)          .7452368***   (7.08) 
EAST_i                .2727333**    (2.16)          .2468845**    (1.97) 
EAST_j                .1931158*     (1.66)          .3207784***   (2.60) 
PROF_i                .0010049      (0.08) 
PROF_j                .086771***    (7.36) 
ln(ABS_PROF_i)                                     -.0302408      (-0.23) 
ln(ABS_PROF_j)                                     1.028326***    (8.02) 
ln(DISTANCE)        -1.03381***     (-11.81)      -1.055287***    (-11.60) 
constant            -7.697556***    (-6.89)       -4.010802***    (-6.07) 

F-statistic         F(8, 231) = 153.40***         F(8, 231) = 148.73*** 
R²                  0.86                          0.87 
Number of obs. = 240 

***/**/* indicate significance at the 1/5/10 percent level. 
Data sources and data description are found in appendix 5.2.3. 


Berlin, and zero otherwise. PROF measures the number of professors per 100 
high school graduates and was calculated separately for the sending and the 
receiving state. Alternatively I use the log of the absolute number of professors in 
that state (ln(ABS_PROF)).²⁹ The regression model thus is 

$$
\begin{aligned}
\ln(\mathrm{FRESHMEN}_{ij}) ={}& b_0 + b_1 \ln(\mathrm{HS\text{-}GRAD}_i) + b_2 \ln(\mathrm{HS\text{-}GRAD}_j) \\
&+ b_3 \ln(\mathrm{DISTANCE}) + b_4\,\mathrm{PROF}_i + b_5\,\mathrm{PROF}_j + b_6\,\mathrm{ADJACENT} \\
&+ b_7\,\mathrm{EAST}_i + b_8\,\mathrm{EAST}_j + u_{ij},
\end{aligned}
$$

where $u_{ij}$ is a disturbance term with zero mean and normal distribution. 
The regression was estimated with robust standard errors. Results are given in 
Table 3. 
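
The estimation step can be illustrated with a minimal sketch. It is not the estimation behind Table 3: the data are synthetic, the adjacency rule and the "true" coefficients are assumptions chosen only to resemble the order of magnitude of column (1), and the robust standard errors are of the HC1 type, which the text does not specify.

```python
import numpy as np


def ols_robust(X, y):
    """OLS point estimates with heteroskedasticity-robust (HC1) standard errors."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    # Sandwich estimator (X'X)^-1 X' diag(e^2) X (X'X)^-1 with n/(n-k) correction.
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = xtx_inv @ meat @ xtx_inv * n / (n - k)
    return beta, np.sqrt(np.diag(cov))


# Synthetic stand-in for 16 states (NOT the paper's data).
rng = np.random.default_rng(0)
n_states = 16
hs_grad = rng.uniform(2500, 57000, n_states)      # high school graduates
prof = rng.uniform(3.6, 20.9, n_states)           # professors per 100 graduates
east = np.array([1.0] * 5 + [0.5] + [0.0] * 10)   # 0.5 mimics the Berlin coding

# All ordered pairs (i, j) with i != j: 16 * 15 = 240 gross flows.
i_idx, j_idx = np.nonzero(~np.eye(n_states, dtype=bool))
distance = rng.uniform(33, 812, i_idx.size)       # hypothetical distances in km
adjacent = (distance < 150).astype(float)         # hypothetical adjacency rule

names = ["constant", "ln(HS-GRAD_i)", "ln(HS-GRAD_j)", "ln(DISTANCE)",
         "PROF_i", "PROF_j", "ADJACENT", "EAST_i", "EAST_j"]
X = np.column_stack([
    np.ones(i_idx.size),
    np.log(hs_grad[i_idx]), np.log(hs_grad[j_idx]),
    np.log(distance),
    prof[i_idx], prof[j_idx],
    adjacent,
    east[i_idx], east[j_idx],
])

# Assumed coefficients b0..b8 used to generate the synthetic flows.
true_b = np.array([-7.7, 0.9, 1.0, -1.0, 0.0, 0.09, 0.75, 0.27, 0.19])
ln_freshmen = X @ true_b + rng.normal(0.0, 0.3, i_idx.size)

beta, se = ols_robust(X, ln_freshmen)
for name, b, s in zip(names, beta, se):
    print(f"{name:15s} {b:9.4f}  (t = {b / s:7.2f})")
```

With real data, the columns of `X` would be replaced by the variables described in appendix 5.2.3; the ordering of the regressors follows the regression equation above.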

As expected, the size effect is strong and significant: larger states receive 
more out of state students and conversely more students emigrate from large 
states for their studies. Migration is strongly negatively affected by distance 
between states and 2.1 times as many students migrate between adjacent 
states. East German states (and Berlin) receive more out of state students and 
more students emigrate from them. There is no statistically significant evi- 
dence that states with higher university quality experience less gross outflow 


29 I used a Box-Cox transformation which pointed towards the superiority of the 
log-log specification. I could not have used the log of professors per high school graduates as 
this would have estimated twice the size effect of the number of high school graduates. 
of students. However, states with a better ratio of professors per 100 high 
school graduates attract out of state students more strongly. That makes net 
flows strongly responsive to differences in educational quality.³⁰ The second 
model paints basically the same picture, yet the strong correlation between 
HS-GRAD and ABS_PROF makes an interpretation of the point estimates and 
significance levels of these four variables very difficult (see appendix 5.2.3). 
Student migration is thus shaped by pull factors rather than push factors: The 
results provide empirical support for my basic hypothesis that university 
quality is a strategic variable for the states to attract students and enhance 
human capital endowment. 
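
The magnitudes behind these statements can be read off the point estimates. The following sketch only does the arithmetic for column (1) of Table 3; the interpretation of a dummy coefficient as exp(b), of a level regressor as exp(b) - 1 per unit, and of a log regressor as an elasticity is the standard reading of a log-linear model, not something specific to this paper.

```python
import math

# Column (1) point estimates as printed in Table 3.
b_adjacent = 0.7550757   # dummy for adjacent states
b_prof_j = 0.086771      # professors per 100 graduates in the receiving state
b_distance = -1.03381    # elasticity with respect to distance

# A dummy shifts ln(FRESHMEN_ij) by b, i.e. multiplies the flow by exp(b).
adjacent_factor = math.exp(b_adjacent)

# One additional professor per 100 graduates in the receiving state.
prof_effect = math.exp(b_prof_j) - 1

# Doubling the distance multiplies the flow by 2 ** b.
doubling_effect = 2 ** b_distance - 1

print(f"adjacent states: flows multiplied by {adjacent_factor:.2f}")
print(f"one more professor per 100 graduates: {prof_effect:+.1%} inflow")
print(f"doubling distance: {doubling_effect:+.1%} flow")
```

The first figure corresponds to the factor of about 2.1 quoted in the text for adjacent states.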


4 Conclusion 


The decentralized provision of free education in a federal system is bound to 
produce externalities that give rise to inefficiencies. Because people are 
mobile within a federation they are free to let one state invest in their human 
capital and have another state benefit from the returns to this educational 
investment. Previous literature has focused on the underprovision of educa- 
tion because high-skilled labor is mobile: states anticipate that they may 
produce human capital for the benefit of other states (Justman and Thisse 
1997, Poutvaara and Kanniainen 2000, Büttner and Schwager 2004, Südekum 
2005). 

This paper has taken a different perspective: Because mobility occurs 
largely first at the stage of tertiary education, when prospective students have 
already invested significantly in their human capital, states may have an 
incentive to overprovide tertiary education to attract students, thereby free- 
riding on the primary and secondary education provided by other states. 
(They may underprovide primary and secondary education.) Such a strategy 
makes sense if students are imperfectly mobile after graduating from uni- 
versity, for example because universities create spillovers for the regional 
economy and create jobs for high skilled individuals. If this is so, the number 
of students influences the human capital endowment of the state. 

I have modeled two federal states which produce with immobile labor, 
internationally mobile capital and interregionally mobile, but internationally 
immobile human capital. The decision to provide educational capacity is 
political, and it impacts on the market allocation of mobile capital which is 
attracted by a higher human capital endowment. Mobility of human capital 
occurs at two stages — high school graduates deciding in which state to take up 


30 Of course, ideally migration flows and explanatory variables should have been measured at the 
university level because students go to specific universities rather than to certain states. 
Unfortunately, these data are not available. 
their studies and university graduates seeking employment throughout the 
federation; it is imperfect at both stages. It turns out that other things being 
equal larger states invest relatively less in tertiary education than smaller 
states — the latter have a better professor-to-student ratio than the former. 
This creates a distortion which is due to the strategic interaction states find 
themselves in. 

The empirical part sought to portray stylized facts of the German situation 
in tertiary education. First, there is a strong downward trend in the capacity of 
universities relative to its demand, i.e., the number of high school graduates. 
This indicates a significant deterioration in the quality of tertiary education. 
Second, the number of professors per 100 high school graduates differs sig- 
nificantly across German ‘Länder’ with city states producing large positive 
effects for their neighboring states. There is a clear East-West gap with the 
new ‘Länder’ providing a lower number of professors per 100 high school 
graduates. Except for the city states, Bayern, Hessen, Baden-Württemberg 
and the Saarland show an above average provision of professors. There are 
some indications that this pattern can be explained by differences in budget 
constraints and possibly in political preferences. Third, in their decision to 
migrate, students react systematically to differences in relative educational 
capacities. 

These inefficiencies obviously call for policy interventions if education is to 
remain in the authority of the states rather than the center. There are good 
reasons to keep education decentralized (and possibly to decentralize even 
more): competition is a revelation procedure for new — better — solutions, 
serves as a laboratory for better educational policies and (thereby) produces 
dynamic efficiency gains. Competition enhances educational outcomes, if 
externalities can be adequately internalized. These externalities occur at each 
stage of the system at which mobility occurs. One possibility to internalize 
these externalities is to privatize the costs of education through user fees 
thereby eliminating states’ incentives to free-ride. This may create other dis- 
tortions if individuals are credit constrained or risk averse and thus shy away 
from investing in education. Thus if this solution is ruled out for efficiency or 
distributional reasons, a voucher system at the federal level may constitute 
such a solution. Students (and possibly pupils) are given vouchers for their 
education which they can convert into educational services at any institution 
within the federation. Institutions can redeem them with a central clearing 
institution which is financed at the federal level. Thus, financing and authority 
to regulate educational institutions are effectively delinked, thereby ensuring 
the advantage of decentralized organization and the efficiency of central 
financing. 

Yet, the current university system may create a third form of externality, 
which has not been the subject of this paper. A state provides external benefits to 
other states if it exports more educational services in educating and promoting 
junior staff that is hired by other states than the educational services that it 
imports by hiring junior staff or newly appointed professors from out of state. 
This is the case if junior staff — Ph.D. candidates, assistant professors, or 
research associates seeking a habilitation — are paid more than their marginal 
value product to the university minus the annuity of their educational 
investment. This issue is left for future research. 


5 Appendix 
5.1 The determinants of net capital exports 


With constant population the steady state condition for capital accumulation 
requires that the capital stock owned by a society is constant. In other words, 
depreciation is exactly replaced by new investment financed out of savings, 
regardless of where the income was generated or the capital installed: 
$sY = \delta K^S$. If the depreciation rate is high, the capital owned by the society 
will fall short of the domestically installed capital, $K^S < K^*$. As the marginal 
return to capital for the domestically owned capital stock exceeds the world 
return ($F_K(K^S,H,L) > r^w$), the economy attracts foreign capital. This is 
shown in figure 5.1. Recall that $K^*$ is given by the requirement that its 
marginal rate of return equal the world rate of return, cf. eq. (1). 

For low depreciation rates domestically owned capital will exceed domes- 
tically installed capital. The exported capital earns the world market rate of 
return; therefore the equilibrium condition is represented by the intersection 


Figure 5.1 Steady state capital stock of a small open economy 
of the ray from the origin with slope $\delta$ representing total depreciation and the 
graph that depicts total saving which has the slope $r^w s$. 
Specifying the production function of the Cobb-Douglas type 


(5.1)   $Y = F(K,H,L) = K^{\alpha} H^{\beta} L^{1-\alpha-\beta}$ 


and using eq. (1) I derive the amount of installed capital as 


(5.2)   $K^* = \left(\frac{\alpha}{r^w}\right)^{\frac{1}{1-\alpha}} H^{\frac{\beta}{1-\alpha}} L^{\frac{1-\alpha-\beta}{1-\alpha}}$ 


In a closed economy the steady state condition $sY = \delta K^S$ would read as 

$\delta K^S = s\,(K^S)^{\alpha} H^{\beta} L^{1-\alpha-\beta}$ 

and the steady state capital stock would be 

(5.3)   $K^S = \left(\frac{s}{\delta}\right)^{\frac{1}{1-\alpha}} H^{\frac{\beta}{1-\alpha}} L^{\frac{1-\alpha-\beta}{1-\alpha}}$. 


Obviously the closed economy steady state capital stock equals the open 
economy capital stock if it falls short of the domestically installed capital stock 
$K^*$, but it is smaller if capital is exported, as the marginal return to capital is $r^w$ 
in case of capital export, but lower if the economy is closed. Moreover, if the 
closed economy capital stock in the steady state exceeds $K^*$ then if this 
economy is opened it will export capital. With subscripts c and o denoting the 
closed and the open economy, 


(5.4)   $K^S_c > K^* \Rightarrow K^S_o > K^* \wedge K^S_o > K^S_c$ 
        $K^S_c < K^* \Rightarrow K^S_o < K^* \wedge K^S_o = K^S_c$ 

Thus capital export occurs if $s\,r^w > \alpha\,\delta$, where I have made use of eqs. (5.2) - 
(5.4). Assuming that the world production function is of the same 
Cobb-Douglas type (but possibly with other parameter values) and noting that 
$F_K = \alpha Y / K$ and $\alpha^w \delta^w / s^w = r^w$, the small open economy will import capital iff 
$s < s^w \frac{\alpha\,\delta}{\alpha^w \delta^w}$, 
where superscript w indicates world values. For the same technologies this 
condition reduces to $K^S < K^* \Leftrightarrow s < s^w$. If the economy’s savings rate 
exceeds (falls short of) the world savings rate, it will export (import) capital. 
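
The comparison of savings rates can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's derivation: it assumes a Cobb-Douglas technology with exponents α, β and 1 - α - β, takes the installed capital stock from the condition that the marginal product of capital equals the world rate of return, and treats the world as a closed economy in steady state with the same technology. All function names and parameter values are hypothetical.

```python
def installed_capital(alpha, beta, r_w, H=1.0, L=1.0):
    """K*: capital stock at which the marginal product of capital equals r_w."""
    return (alpha * H**beta * L**(1 - alpha - beta) / r_w) ** (1 / (1 - alpha))


def closed_steady_state(alpha, beta, s, delta, H=1.0, L=1.0):
    """Closed-economy steady state capital stock from s * Y = delta * K."""
    return (s * H**beta * L**(1 - alpha - beta) / delta) ** (1 / (1 - alpha))


def world_rate(alpha, delta, s_w):
    """World return if the world is a closed economy in steady state:
    r_w = F_K = alpha * Y / K = alpha * delta / s_w."""
    return alpha * delta / s_w


def exports_capital(alpha, beta, s, delta, r_w):
    """True if the closed-economy steady state stock exceeds K*."""
    return closed_steady_state(alpha, beta, s, delta) > installed_capital(alpha, beta, r_w)


# Hypothetical parameter values for illustration only.
alpha, beta, delta, s_w = 0.3, 0.3, 0.05, 0.20
r_w = world_rate(alpha, delta, s_w)

for s in (0.10, 0.15, 0.25, 0.30):
    status = "exports" if exports_capital(alpha, beta, s, delta, r_w) else "imports"
    print(f"s = {s:.2f} (world: {s_w:.2f}) -> {status} capital")
```

Under these assumptions the economy exports capital exactly when its savings rate exceeds the world savings rate, which is the statement made in the last sentence above.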
5.2 Data Sources 


5.2.1 Data for Sections 3.1-3.2 

Data used in sections 3.1 and 3.2 were provided by the Federal Office of 
Statistics (Statistisches Bundesamt). The number of professors and the number of 
habilitations by state are based on data in Fachserie 11, Reihe 4.4 and were 
provided as Excel files. Data for professors start in 1992 only, as the new law on 
university statistics (Hochschulstatistikgesetz) of 1990 was implemented 
beginning in 1992. Population data were taken from www.destatis.de; high school 
graduates by state were provided as files by the Statistisches Bundesamt.³¹ 
Professors per 100 high school graduates were averaged over the last five 
years. Students’ cross-state migration flows were obtained from the Statistisches 
Bundesamt, Fachserie 11, R 4.1 and refer to the winter semester 2004/05. 


5.2.2 Data and Descriptive Statistics for Section 3.3 

Budget data for section 3.3 are taken from the following sources: 
Bundesministerium der Finanzen, “Finanzbericht 2005”, Köln 2004; 
Bundesministerium der Finanzen, “Finanzbericht 2000”, Bonn 1999; and Statistisches 
Bundesamt, “Volkswirtschaftliche Gesamtrechnung - Bruttoinlandsprodukt” at 
http://www.statistik-portal.de/Statistik-Portal/de_jb27_jahrtab65.asp. 

The political index (PI) is based on the political dummy $PD_i$ for year i, 
which is one if in that year the state prime minister is conservative (or liberal, 
or from the Schill party) and zero if she or he is social democratic (or green or 
socialist). It is aggregated over the last ten years according to the following 
formula: 

$$PI = \sum_{i=1994}^{2003} PD_i \,(1.05)^{\,i-2004}$$


An analogous aggregation was made for budget per capita used in the 
regression. Data for the political index were taken from www.deutschland.de 
and the pages for the states linked to this page.³² 
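
A minimal sketch of the aggregation follows. The exponent is an assumption: the weights are taken to be 1.05 raised to the power i - 2004, so that the most recent year (2003) is discounted once and the earliest year (1994) ten times. This choice reproduces the maximum of 7.7217 reported for the political index in Table 5.1, which a state governed by a conservative prime minister in all ten years would attain. The function name and arguments are hypothetical.

```python
def political_index(dummies, first_year=1994, base_year=2004, rate=0.05):
    """Discounted sum of yearly political dummies PD_i.

    dummies: 0/1 values for consecutive years starting in first_year.
    Assumed weight for year i: (1 + rate) ** (i - base_year).
    """
    return sum(
        d * (1 + rate) ** (first_year + k - base_year)
        for k, d in enumerate(dummies)
    )


always_conservative = political_index([1] * 10)
never_conservative = political_index([0] * 10)
# Conservative only in the five most recent years (1999-2003).
recent_switch = political_index([0] * 5 + [1] * 5)

print(round(always_conservative, 4))
print(never_conservative)
print(round(recent_switch, 4))
```

Because recent years carry more weight, a state that turned conservative recently scores higher than one that was conservative only in the early years of the window.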

Descriptive statistics and cross correlations are provided below. 


31 Earlier data (up to 1992) are also published in Kultusministerkonferenz 
(http://www.kmk.org/statist, Veröffentlichung Schüler, Klassen, Lehrer und Absolventen der 
Schulen 1982 bis 1991, Nr. 121). 

32 For instance, http://www.baden-wuerttemberg.de/fm/1899/Regierungen_BW.pdf, 
http://www.niedersachsen.de/master/C2860930_N1461178_L20_D0_I198.html, 
http://www.landeshauptarchiv.de/geschichte/kabinette.html. See also Biographisches 
Handbuch der deutschen Landesregierungen nach 1945 (2006) München: K.G. Saur. 
Table 5.1 Descriptive statistics for variables used in the regression of Table 2 
(section 3.3) 


Variable                     Obs.   Mean       Std. Dev.   Min        Max 
Professors per 100 high      16     10.17341   4.379247    3.610794   20.88965 
  school graduates 
Budget per capita            16     .0291702   .0101174    .0197562   .0495509 
Political index              16     3.295331   3.119904    0          7.7217 
Share of state GDP           16     6.25       6.467188    1.087774   22.07029 
West-Dummy                   16     .65625     .4732424    0          1 
City state dummy             16     .1875      .4031129    0          1 
Population                   16     5158229    4841514     663129     1.81e+07 


Table 5.2 Cross correlation for variables used in the regression of Table 2 
(section 3.3) 


                       Profs per   Political   Share of    West-     Popula-   Budget 
                       100 high    index       state GDP   Dummy     tion      per 
                       school                                                  capita 
                       grads 

Profs per 100 high      1.0000 
  school grads 
Political index        -0.0125      1.0000 
Share of state GDP     -0.0092      0.1682      1.0000 
West-Dummy              0.5491     -0.1952      0.4659      1.0000 
Population             -0.1258      0.1224      0.9851      0.3827    1.0000 
Budget per capita       0.5392     -0.0459     -0.5289     -0.2654   -0.5591    1.0000 
City state dummy        0.8007     -0.1147     -0.2671      0.1857   -0.3310    0.8854 


5.2.3 Data and Descriptive Statistics for Section 3.4 


Data sources are given in appendix 5.2.1. Distance was calculated as the 
distance in kilometers between state capitals. Distances were compiled from online 
distance tables and, where these were not available, with the help of a route planner, 
both from “Tank & Rast” 
(http://www.tank.rast.de/services/entfernungstabelle/entfernungen.htm, 
November 2005, and http://www.tank.rast.de/services/routenplaner/, 
respectively). If a state had more than one important metropolitan area, a 
population weighted average of the relevant distances was calculated (cf. 
fn. 28). In particular, I used Stuttgart and Karlsruhe for Baden-Württemberg, 
München and Nürnberg-Fürth-Erlangen for Bayern, Frankfurt for Hessen, 
Düsseldorf and Köln for Nordrhein-Westfalen, Dresden and Leipzig for Sachsen, 
and the state capital for all other states. 
Descriptive statistics are given in Table 5.3 below. 

Table 5.3 Descriptive statistics for the student migration model of section 3.4 


Variable       Obs   Mean       Std. Dev.   Min        Max 
FRESHMEN_ij    240   327.525    458.1392    1          2898 
HS-GRAD_i      240   16153.44   14479.45    2575       57409 
ADJACENT       240   .2416667   .428988     0          1 
EAST_i         240   .3125      .4644811    0          1 
PROF_i         240   10.17341   4.24905     3.6108     20.8897 
ABS_PROF_i     240   1555.245   1406.356    315.8367   4815.697 
DISTANCE       240   409.375    185.5323    33         812 


Cross correlations are given in Table 5.4 below. 
Tertiary Education in a Federal System 


Comment by 


STEFAN VOIGT 


Schulze’s paper deals with a straightforward question: to what extent do 
federal systems of free tertiary education (such as Germany’s) suffer from 
positive externalities that give rise to strategic behavior of the states in the 
provision of university education? Schulze is definitely not the first to ask this 
question but he adds an interesting twist to it: previous papers assumed that 
states tried to free ride on the free tertiary education provided elsewhere; in 
the aggregate this would lead to an underprovision of university education. 
Schulze now takes into account that university graduates might have a bias to 
stay close to where they graduated and that they could produce positive 
regional spillovers which might raise GDP regionally. This implies that there 
might not be incentives to underprovide tertiary education after all. Indeed, 
its overprovision is a possibility. But states overproviding tertiary education 
might have an incentive to underprovide both primary and secondary edu- 
cation. 

His model is based on a neoclassical production function and Schulze is 
interested in steady states. The model nicely captures that human capital 
formation (a consequence of the education budget) does not only increase 
GDP directly but also indirectly by attracting physical capital (because the 
return to capital rises initially). Finally, the increase in both physical and 
human capital leads to wage increases (as labor is assumed to be immobile). 
This means that GNP increases. The Nash-equilibrium in a model with two 
states predicts that larger states (in terms of population size) will provide 
more tertiary education. 

The empirical section is interested in identifying differences in the provision 
of tertiary education — operationalized as the number of professors per high 
school graduate. The provision of professors is endogenized as the next step. 
The empirical section further deals with the question whether the differences 
in the number of professors between the German states are a significant 
variable in explaining student mobility. This is implemented by drawing on a 
gravity model. The results show that there is considerable variation in the 
number of professors which can be explained by a dummy for the West (states 
in the western part of Germany employing significantly more professors) and 
the state’s per capita budget. There is, however, no clear-cut effect of state size 
on the number of professors which means that the theoretically derived 
prediction is not confirmed. 
The interesting twist in Schulze’s paper is the possibility that states in a 
federation might be oversupplying tertiary education while undersupplying 
primary and secondary education. My comment will, hence, focus on this 
aspect. Before picking it up, it might, however, be in order to briefly discuss 
the proxies used in the empirical section. 

As a proxy for the provision of tertiary education, Schulze uses the 
number of professors employed full time at universities, technical universities 
and pedagogic universities normalized by the number of high school 
graduates. He thus excludes professors employed at universities of applied sciences 
and other more applied organizations. This is one of a number of 
possible quantitative measures. I am not sure, however, whether it is the most 
straightforward one. Positive regional spillovers might also be generated by 
the more applied universities; their claim to being more applied suggests that 
they aim at such spillovers. Using the number of professors as a proxy implies 
that it is primarily their number that is determining differences in the quality 
of tertiary education. A host of other factors might be relevant too: the 
number of assistants, the quality of the library, the laboratories, the computer 
pools and so forth. I wonder whether simply drawing on a state’s budget for 
tertiary education would not have been as good a proxy. 

Student migration flows across states are conjectured to be influenced by 
the number of professors in both the “exporting” as well as the “importing” 
state. Schulze finds that the number of professors in the importing state is 
highly significant whereas the number in the exporting state is insignificant, in 
other words that there is more of a pull than a push effect. This seems to make 
perfect sense, except that students do not seem to choose a state to study in 
but rather a university to study at. Schulze is well aware of this problem (see 
footnote 30) but it remains a problem. Some German state governments are 
said to consider some universities as their prestige projects whereas others 
barely survive. This will supposedly not be reflected in Schulze’s proxy. 

It seems straightforward to assume that student migration flows are at least 
partially determined by differences in the quality of university education. Yet, 
it would also seem straightforward that students base their decisions on 
readily available indicators. It would, hence, be interesting to look at the 
correlation between the Schulze measure and other readily available rankings 
that have been widely publicized and discussed in recent years in Germany. 

But let us move to the possibility that tertiary education could be over- 
provided while primary and secondary education could be underprovided. 
The paper shows that the professor-per-100-high-school-graduates ratio has 
deteriorated from 11.26 in 1996 to 9.43 in 2004. This is a decline by about one 
sixth in less than a decade. There certainly does not seem to be any race to the 
top with regard to university education in Germany. The recent reform of the 
German payment scheme for professors basically means that salaries were 
substantially reduced. To be a professor has thus become less attractive 
relative to other occupations. It is straightforward to predict that average quality 
will suffer from this reform, another indicator for the absence of a race to the 
top. 

Schulze might argue that the attempt to free ride on primary and secondary 
education was the more important part of his argument anyway. Free riding 
on others presupposes that high school graduates are highly mobile before 
beginning tertiary education (and substantially less so after having finished 
tertiary education). Empirically, we know, however, that around two thirds of 
all high school graduates who go on to university remain in the state in which 
they received their high school diplomas. The notion that states could free ride 
on primary and secondary education implies that basic education could be low 
quality whereas university education could be top-notch. Yet, empirically, the 
correlation between states offering high quality basic education and those 
offering high quality university education seems to be extraordinarily high 
(e.g. Bavaria, Baden-Württemberg). And this seems to make intuitive sense 
too: parents who have graduated from university and stayed in the state would 
supposedly consider moving out if basic education for their children was not 
high quality too. 

In sum, this paper deals with an interesting question and adds a fascinating 
new twist to it. The theoretical model nicely captures the most relevant basic 
ideas. The empirical section presents interesting data and shows that part of 
the theoretical predictions cannot be confirmed. Yet, there is scope for more 
detailed empirics: Schulze terms some of his econometrics “Mickey Mouse” 
because the number of observations is very low. 
by 
GUSTAVO CRESPI and ALDO GEUNA* 


1 Introduction 


There is increasing recognition in the UK and other OECD countries of the 
importance of scientific research in providing the foundations for both inno- 
vation and competitiveness. This has resulted in increased public funding for 
research in the UK and elsewhere. At the same time, there is a lack of sys- 
tematic evidence on how such investments can lead to increasing levels of 
scientific output and, ultimately, to better economic performance. Much of the 
available literature concentrates on the effects of public funding of basic 
research on either firms’ innovative activities (see among others Cohen, 
Nelson and Walsh 2002; Klevorick et al. 1995; Jaffe 1989; Narin, Hamilton and 
Olivastro 1997) or firm performance (Adams 1990), bypassing the question of 
how to measure scientific output. The reason for this is the difficulty of 
identifying a stable causal relationship between the resources spent on the 
science budget and ‘intermediate’ scientific outputs. This difficulty originates 
from the dynamic nature of this relationship. There is a persistent and 
therefore recursive feedback between inputs and outputs, which is exacer- 
bated by lack of appropriate information for analysis. Among the few studies 
that have attempted to address the problem are Adams and Griliches (1996) 
and Johnes and Johnes (1995). This study is based on and further develops 
Adams and Griliches’s methodology. 

The national science budget comes from several sources. For example, the 
UK higher education sector received a total of £4,035 million for research 
and development in 2001, financed by the Office of Science and Technology 
(OST) via the research councils (£942 million), the Higher Education Funding 
Councils (HEFC) (£1,474 million), other UK sources such as direct government 
(£238 million), higher education institutions (£166 million), non-profit 
organisations (£660 million) and business enterprises (£250 million), and funding 
from other countries or supranational institutions (£304 million). These 
contributions are allocated within the system according 


* The authors are grateful to Paul David, David Humphry, Ben Martin, Fabio Mon- 
tobbio, and Ed Steinmueller, and participants in the Use of Metrics in Research 
Assessment Workshop (Oxford University-Brasenose College, September 2004) and 
S&T Indicators Conference (Leiden, September 2004), for comments and suggestions. 
The authors would also like to thank Evidence Ltd for supplying some of the data for the 
econometric analysis. This paper is derived from a report commissioned by the Office of 
Science and Technology, Department of Trade and Industry. All mistakes, omissions, and 
views expressed are the sole responsibility of the authors. 
to scientific field and research institution, to provide the resources needed for 
research. 

The scientific process produces several research outputs that can be clas- 
sified into three broadly defined categories: (1) new knowledge; (2) highly 
qualified human resources; and (3) new technologies. This paper focuses on 
the determinants of the first two types of research output, which are the most 
closely related to the science research budget. There are no direct measures of 
new knowledge, but several proxies have been applied in previous studies. The 
two that we use in our study, which are also the most commonly used 
measures, are publications and citations. These are incomplete proxies for the 
production of new knowledge and have several shortcomings (Geuna 1999). 
Highly qualified human resources have been proxied by the total number of 
graduate students that have completed their studies. 

In this paper we focus on the determinants of university research output (as 
measured by publications, citations and numbers of graduate students) in the 
UK. We use an original dataset that includes information for the 52 ‘old’ UK 
universities across 29 scientific fields for a period of 18 years (1984/85 to 
2001/02). The paper does not aim to produce exact indicators of the dynamics of the 
science system on the basis of which to draw strong policy conclusions, and we 
fully acknowledge the limitations of the input-output data we use. 

The paper is structured as follows. In Section 2 we present the methodology 
and the data sources; we depart from the traditional static knowledge pro- 
duction function model to estimate different dynamic panel data specifica- 
tions. In Section 3 we present and discuss the results of the estimations. In 
Section 4 we use the residuals of our fitted knowledge production functions to 
evaluate the evolution of UK scientific productivity. Finally, in the conclusion 
we discuss the limitations of this study and suggest possible further develop- 
ments. 


2 Methodology and data sources 


Our methodological approach develops the standard knowledge production 
function model of Adams and Griliches (1996). They use the expression: 


(1)   $y_{it} = \alpha + \beta\, W(r)_{it} + \gamma' X_{it} + u_{it}, \quad i = 1, \ldots, N$ 


where $y_{it}$ is the (log) ‘intermediate’ research output (papers and 
citations) of field i at time t, $W(r)_{it}$ is (the log of) a distributed lag 
function of real past R&D expenditure, and $X_{it}$ is a vector of control 
variables. The main focus of this analysis is on $\beta$, the elasticity of research 
output with respect to research input and the measure of local returns to scale 
in research. Diminishing (constant or increasing) returns predominate when 
$\beta < 1$ ($\beta = 1$ or $\beta > 1$). 
In order to build a science capital stock we need: the time length over which 
the past investments in university R&D are considered to be relevant for the 
current research; and a weighting scheme to account for past university R&D. 
This is where our methodology departs from Adams and Griliches (1996). 
While they present the results for three and five year distributed lags of R&D, 
where the weighting pattern is completely ad hoc, we search for a lag struc- 
ture, and develop a procedure to estimate a flexible and ‘data driven’ lag 
structure. 

There are two dominant models. First, the Autoregressive Distributed Lag 
(ADL) model, which assumes a very flexible and unrestricted relationship 
between (log) R&D inputs and outputs, but at the cost of estimating a large 
number of parameters (see for example, Guellec and van Pottelsberghe 2001 
and Klette and Johansen 2000). Second, the Polynomial Distributed Lag 
(PDL) or Almon Model, which specifies the weights as polynomial functions 
of a particular estimated degree (Crespi and Geuna 2004). These log linear 
models imply a strong complementarity between the knowledge inputs.¹ In
other words, the greater the initial knowledge, the greater will be the amount 
of knowledge obtained from a given amount of R&D. The more knowledge is 
produced, the more it can be recombined to produce new knowledge. For- 
mally we will assume that: 


(2) $K_{it} = \prod_{j=0}^{J} r_{i,t-j}$


In this paper we present the results of the PDL Model; we used the ADL 
model in another paper and obtained consistent results. Let us now define the 
following ‘finite’ distributed lag model: 


(3) $y_{it} = \alpha + \sum_{j=0}^{q} \beta_j r_{i,t-j} + \gamma X_{it} + u_{it}, \quad i = 1, \ldots, N$


Although a model like (3) in theory can be estimated in a straightforward 
manner, there is the potential problem of very long lags in which case the 
multicollinearity is likely to become quite severe. In such cases it is common 
to impose some structure on the lag distribution, reducing the number of 
parameters in the model. It is in this context that the PDL model can be 
useful. The approach is based on the assumption that the true distribution of 


1 By complementarity we mean that the marginal productivity of current R&D
investment tends to zero if past R&D investment also tends to zero. This
assumption is particularly apt in the case of science where we ‘stand on the
shoulders of giants’ to build new knowledge.


74 Gustavo Crespi and Aldo Geuna 


the lag coefficients can be very well approximated by a polynomial of a
fairly low order:


(4) $\beta_j = \delta_0 + \delta_1 j + \delta_2 j^2 + \cdots + \delta_p j^p, \quad j = 0, \ldots, q; \; q \geq p$


The order of the polynomial, p, is usually taken to be quite low, rarely 
exceeding 3 or 4. By inserting (4) into (3), one can estimate a transformed 
model where the estimated coefficients are the deltas that can be put back 
into (4) in order to recover the original weights. In addition to the p+1 
parameters of the polynomial, there are two unknowns to be determined: the 
length of the lag structure, q, and the degree of the polynomial, p. Here we 
follow the non-standard procedure of setting the length of the lags using a 
priori information and then searching for the degree of the polynomial 
function. 
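
The substitution of (4) into (3) can be illustrated with a minimal Python
sketch. It is not the estimation code behind the results reported in this
paper: the function names `almon_weights` and `almon_regressors`, the deltas,
and the R&D series are all hypothetical and chosen only for illustration. The
sketch builds the p+1 transformed regressors and checks that the distributed
lag term of (3) equals the transformed term written in the deltas.

```python
def almon_weights(delta, q):
    """Lag weights beta_j = sum_k delta[k] * j**k for j = 0..q, as in (4)."""
    return [sum(d * j**k for k, d in enumerate(delta)) for j in range(q + 1)]


def almon_regressors(r, q, p):
    """Transformed regressors z_k[t] = sum_j j**k * r[t-j], for t = q..len(r)-1."""
    return [
        [sum(j**k * r[t - j] for j in range(q + 1)) for k in range(p + 1)]
        for t in range(q, len(r))
    ]


# Hypothetical values for illustration only (not estimates from the paper).
q, p = 6, 3                              # lag length and polynomial degree
delta = [0.05, 0.04, -0.01, 0.0005]      # p + 1 polynomial coefficients
r = [10.0, 10.2, 10.1, 10.4, 10.6, 10.5, 10.9, 11.0, 11.2]  # (log) R&D series

beta = almon_weights(delta, q)           # q + 1 implied lag weights
z = almon_regressors(r, q, p)            # one row of p + 1 regressors per period

# For the first usable period t = q, the two formulations of the lag term agree.
t = q
direct = sum(beta[j] * r[t - j] for j in range(q + 1))
transformed = sum(delta[k] * z[t - q][k] for k in range(p + 1))
```

In an actual estimation the deltas would be obtained by regressing the output
variable on the columns of `z` (plus controls), and `almon_weights` would then
recover the q+1 lag weights from the p+1 estimated deltas.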

The standard procedure is to use the same dataset first to search for
the optimum time lag (using some information criteria) and then, taking the 
best lagging as true, to search for the optimum polynomial function. However, 
this sequential search approach carries the problem that unless the test sta- 
tistics are overwhelming, the true significance levels in the tests remain to be 
derived, and the true distribution of the resulting estimator is unknown. 
Following the evidence in Crespi and Geuna (2004) for a large set of OECD 
countries, we set a lag length of 6 years for publications and research students, 
and 7 years for citations. 

Assuming that we know the right lag length we proceed by looking for the 
right polynomial function. We start by using a fifth degree function and 
proceed by testing sequential unit reductions in the degree. It is important to 
note that in order to retain the appropriate significance level in each step we 
used a very low individual significance level. The PDL model also implies a 
set of constraints on the unrestricted model (without a specified functional 
form for the lags). For example, if the known lag length is 6 and we use a third 
degree polynomial function, we are implicitly imposing three constraints. In 
addition, we have endpoint constraints, which allow the lag distribution to be 
‘tied down’ at its extremes. These endpoint constraints capture the idea that 
there is no effect of R&D on the research outputs before the current period 
and also that there is no effect from the research inputs after the maximum 
lag. That is, we need to impose: 


(5) $\beta_{-1} = 0$ and $\beta_{q+1} = 0$


In total we have five constraints. One way to validate the PDL model is to test 
whether these constraints are valid, which can be done by using a simple chi- 
square test. 
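
The count of five constraints can be checked with a short calculation, under
the assumption that each lag coefficient of the unrestricted model counts as
one free parameter. With a lag length of q = 6 the unrestricted model has
q + 1 = 7 lag coefficients, while a third degree polynomial has p + 1 = 4 free
parameters:

```latex
\begin{align*}
\text{shape restrictions}    &= (q + 1) - (p + 1) = 7 - 4 = 3, \\
\text{endpoint restrictions} &= 2 \quad (\beta_{-1} = 0,\ \beta_{q+1} = 0), \\
\text{total restrictions}    &= 3 + 2 = 5 .
\end{align*}
```

By the same arithmetic, a lag length of 7 combined with a third degree
polynomial would imply six restrictions.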

Finally, the PDL model also requires exogeneity of R&D. We carried out a
bivariate Granger causality test. Following Rouvinen (2002), we implemented 
the test using a dynamic panel data in differences (DPD-DIF) model, where the 
first differences of the dependent variables are regressed on lags of the first 
differences of the dependent and independent variables. The findings (not 
reported here for reasons of space, but available from the authors) would suggest 
that there is a two-way causality between R&D, and publications and citations. 
This may result in a biased estimation of the elasticity coefficients. Given the 
problems with available data, and the experimental nature of our work, we 
acknowledge that our estimations will be biased, and, taking a conservative 
approach, we nevertheless decided to use the PDL model because it allows us to 
use the level variables and therefore to maintain a high level of information, 
which is crucial in the case of variables such as ours which are very noisy. 

To estimate the model we used the SPRU science field database.² The
dataset includes information on 52 old universities covering 29 scientific
fields, over an 18 year period 1984/85 to 2001/02.³ The 52 old universities
considered provide a good representation of the scientific research carried out
in UK universities; in 2001/02 research grant and contract income for these
universities accounted for 87% of total UK research grant and contract funding.
The dataset has four variables (not including institution and field ids):
information on total research grant and contract income;⁴ number of
publications; number of citations;⁵ and total number of graduate students.

In the following sections we present the results of the field level estimates of 
the science production function for publications, citations, and graduate stu- 
dents. Because information about publications and citations is only available 
at field level, we cannot estimate a knowledge production function for each of 
the 29 fields. We need to aggregate the micro fields into more broadly defined 
categories by mapping the 29 fields into the 4 broad categories in the OECD 


2 A detailed description of the procedure used for the development and content of the
datasets can be found in Crespi and Geuna (2004). 

3 The 52 old universities do not include the Open University, Cranfield University, the 
independent University of Buckingham (not in University Statistical Record statistics) or 
Lancaster University (not in the Higher Education Statistical Agency statistics). Due to 
problems with the archiving of the University Statistical Record data, London University 
data are the sum of all its colleges. Not all the universities are active in every scientific 
field, every year. 

4 Total research grant and contract income includes all direct research funding 
received from the research councils, industry, the EC, foundations, etc. Total research 
grant and contract income accounted for 38% of total research income in 1988/89 
increasing to about 60% in 2000/01 (http://www.ost.gov.uk/setstats/5/t5_1.htm; accessed 
26/1/2006). We were not able to obtain total research income broken down by scientific 
field because this breakdown of HEFC funding by institution and subject area for the 
whole period was not available. 

5 The source of publication and citation numbers is the Thomson ISI(R) ‘National 
Science Indicators’ (2002) database. 
statistics. The four macro fields analysed were: natural sciences; engineering;
medical sciences; and social sciences.

Table 1 summarises the main research outputs used in this section. Across 
the entire period, there is a remarkable stability in the distribution of research 
outputs by field. Broadly speaking, natural sciences and the medical sciences 
together account for 75% and 85% of total publications and citations 
respectively, in the UK. The remaining percentage is split between engi- 
neering (15% and 8% respectively) and the social sciences (10% and 6% 
respectively). The picture changes dramatically when we focus on graduate 
student research output where the importance of the natural sciences declines 
to slightly over 30% at the end of the period, while the medical sciences 
increases from 9% to 13%. Taken together, these two macro fields have a 
much lower output share (45% at the end of the period). Engineering remains 
stable at around 18% for the period, while the social sciences show a
systematic growth from 28% to 36% towards the end of the period.

In order to account for the ‘truncation problem’ in the citations for the most 
recent years, the citations variable was adjusted. One way of controlling for 
truncation is to use what Hall, Jaffe and Trajtenberg (2001) describe as the 
fixed effect approach. This involves scaling citations counts by dividing them 
by the average citation count for a group of publications to which the pub- 
lication of interest belongs. Thus, a publication that received say 11 citations 
and belongs to a group in which the average publication received 10 citations, 
is equivalent to a publication that received 22 citations, and belongs to a group 
where the average number of citations is 20. The groups were defined in terms 
of scientific field and year and the scaling index was computed using the ISI 
dataset at world level. 
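
The scaling step can be sketched in a few lines of Python. This is an
illustration only: the function name `scale_citations`, the field label, and
the group averages are hypothetical and are not taken from the ISI data. The
example reproduces the case described above, where 11 citations against a
group average of 10 are equivalent to 22 citations against a group average of
20.

```python
def scale_citations(citations, field, year, group_mean):
    """Return a citation count relative to the average for the same field and year."""
    return citations / group_mean[(field, year)]


# Hypothetical world-level average citations per publication, by (field, year).
group_mean = {("chemistry", 1990): 20.0, ("chemistry", 1999): 10.0}

older = scale_citations(22, "chemistry", 1990, group_mean)   # long citation window
recent = scale_citations(11, "chemistry", 1999, group_mean)  # truncated window
```

Both publications receive the same scaled value, so the shorter citation
window of the more recent publication no longer penalises it.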

On the basis of these data sources we 

— estimate the science production function for the OECD macro fields 
(natural sciences, engineering, the medical sciences, and the social sciences) 
using information on 29 science fields available for the UK, 

— examine the changes in productivity growth across fields. 


3 The UK knowledge production function estimates 


In this section we present the results of the field level estimates of the science 
production function for publications, citations, and graduate students.⁷ The

6 The citation count is affected by the time span allowed for the papers to be cited: for
example, papers published in 2000 can receive citations in our data only from papers
published in the period 2000-2001; they will be cited by papers in subsequent years, but
we do not observe them.


7 A national level science production function model was statistically rejected in 
favour of four very broadly defined macro-fields. 
four macro fields analysed are: natural sciences, engineering, medical sciences,
and social sciences. For each of these macro fields, the aim was to estimate a 
science production function as follows: 


(6) $y_{it}^{F} = \alpha^{F} + \beta^{F} W(r)_{it}^{F} + \gamma^{F} X_{it}^{F} + u_{it}^{F}, \quad i = 1, \ldots, N; \; F = 1, \ldots, J$


where $y_{it}^{F}$ is the (log) research ‘intermediate’ output (papers,
citations, and graduate students) of scientific micro field $i$ (we have 29
scientific micro fields classified into the 4 broad fields listed above) at
time $t$ (period 1984-2001). $W(r)_{it}^{F}$ is (the log of) a distributed lag
function of real past research grants and contract income by scientific micro
field, and $X_{it}^{F}$ is a vector of the control variables described below.
As explained above, a six-year
lag for publications and graduate students, and a seven-year lag for citations 
were applied; then, conditional on them, we tested the shape of the lag 
function using fourth, third and second degree polynomial functions. In all 
cases we could not reject that the third degree polynomial function was the 
correct one. Also in all cases we tested an unconstrained model and could not 
reject the constrained model as valid. 

The vector $X_{it}^{F}$ refers to a series of control variables included to assess two
important phenomena. 

— First, we want to control for the way in which time is allocated by the 
researchers. One of the most important decisions regarding time for many 
(but not all) university researchers is how it is allocated between research and 
(undergraduate) teaching activities. Because we have information about the 
number of undergraduate students by field and year, we can control for the 
impact on research output of teaching intensity in the different fields. 

— Second, research output can be affected by factors specific to the uni- 
versity (Geuna 1999). We test for three effects: a) localisation (London based 
universities); b) research propensity (Russell group universities versus Group 
94 universities); and c) reputation (when the university was founded). 


The control variables are as follows: 

— Teaching Load: is the ratio of undergraduate students to total staff, 
computed by field and year.⁸

— London: refers to the proportion of research income in each field that is 
invested in universities located in London. 


8 Information on teaching intensity ratio is only available from 1993. As the estimation 
sample starts in 1989, we had to reconstruct the ‘missing’ period. The best imputing 
mechanism was using university level linear interpolation, which respects the hetero- 
geneity across universities and fields. 
— Russell: refers to the proportion of research income in each field that is
spent in universities affiliated to the Russell Group (self-selected group of 
research-led universities). 

- Group 94: is the proportion of research income in each field that is spent 
in universities that belong to the Group 94 (self-selected group of research-led 
universities that are, on average, smaller than the Russell Group, more ori- 
ented to teaching, and with less prestigious research reputations). 

— Medieval Universities: is the proportion of research income in each field 
allocated to universities founded before the 18th century.

— 19th Century Universities: is the proportion of research income in each
field allocated to universities founded in the 19th century.

— 20th Century Universities: is the proportion of research income in each
field allocated to universities founded in the first half of the last century. 

— Post WWII universities: is the proportion of research income in each field 
spent in universities founded after the Second World War, mostly redbrick 
universities. 


The coefficients of these control variables capture, to some extent, the dif- 
ferences in research productivity of the various institutions. The available 
literature on university research production allows us to hypothesise a neg- 
ative coefficient for the undergraduate teaching variable: we can expect a 
negative impact on research production due to the allocation of more time to 
undergraduate teaching activities. The localisation of universities in the 
London area should create positive externalities for research, which increases 
the productivity of those institutions located in London. We expect a positive 
value for the variable London. With regard to the other control variables no 
clear a priori expectation can be formulated; to our knowledge this is the first 
study that has attempted to evaluate these effects. A possible hypothesis is 
that those universities that are more research-led and more prestigious tend to 
assign more importance, and therefore more support, to research, which 
should translate into higher research productivity. 

We estimate the model for the three research outputs: publications; cita- 
tions; and number of graduate students. 


3.1 Publications 


We first show the pattern of weights and then proceed to the results of the 
model. As is clear from Figure 1 (see p. 80), a first important result of our 
estimation is that the lag structures are significantly different across fields. The 
social sciences show a relatively important impact in the short run (during the 
first two years) but the effects diminish over time; the situation in the natural 
and medical sciences is the reverse, the bulk of the impact being concentrated 

Figure 1 Restricted Pattern of Weights (Publications), by fields

towards the end of the lag span. Finally, in the case of engineering we have a
clear parabolic function, which suggests a concentration of impact towards the 
middle of the time period. These differences in the weighting function are 
very important because they point to a differential impact of a given increase 
in the science budget over time. The research output generated in the social 
sciences is much more immediate than in the other sciences, leading to an 
increase in the share of social sciences in total publications in the short run.
This situation is reversed over time in favour of the natural and medical sci- 
ences. 

Table 2 presents the results from using the described weighting pattern to 
compute the sector knowledge stock and to estimate model (6). The first 
interesting result is that the long run elasticity between knowledge stock and 
publications varies widely across broadly defined fields. The highest elasticity 
is found in the medical sciences (0.46) and the lowest in the natural sciences 
(0.20). In all four cases elasticities are significant. The year effect, which 
captures the long run trend in scientific opportunities affecting research 
output, is always positive. It is important to note that as this model does not 
include a specific variable for spillovers from abroad (an international co- 
authorship matrix in each science field would be needed) the time trend also 
captures international spillover effects. The year trend value is highest for 
engineering, and smallest for the medical sciences. 
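
To give a sense of magnitude, the long run elasticities can be translated into
output changes. The calculation below is a worked illustration under the
assumption of a constant elasticity and everything else held fixed; $K$ is
used here only as a shorthand for the level of the knowledge stock. A doubling
of the knowledge stock then implies:

```latex
\[
\frac{y_1}{y_0} = \left( \frac{K_1}{K_0} \right)^{\beta}:
\qquad 2^{0.46} \approx 1.38 \quad \text{(medical sciences)},
\qquad 2^{0.20} \approx 1.15 \quad \text{(natural sciences)} .
\]
```

That is, publications would rise by roughly 38% in the medical sciences but by
only about 15% in the natural sciences.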

In terms of the impact distribution of changes in the research budget, the 
last two rows of Table 2 show the median lag (the year that accumulated at 
least 50% of the impact) and the 90th percentile lag. Consistent with the
weight patterns, the most immediate impact is in the social sciences where 

Table 2 UK Levels Estimates, Publications (Method: field fixed effect)


                              NS          ENG         MS          SS
R&D                           0.208       0.216       0.461       0.340
                              0.112*      0.132*      0.145***    0.086***
Year                          0.014       0.036       0.011       0.033
                              0.007*      0.009***    0.009       0.006***
Undergraduate Teaching        -0.032      -0.014      -0.052      -0.017
                              0.010***    0.014       0.009***    0.007**
London                        0.001       -0.012      0.003       -0.001
                              0.003       0.003       0.004       0.005*
Group94                       0.001       0.004       0.003       -0.008
                              0.004       0.005**     0.003*      0.005
Russell                       0.004       -0.004      0.006       0.000
                              0.003       0.003       0.003       0.005
Medieval                      0.002       0.005       -0.017      0.013
                              0.005       0.004       0.007**     0.004***
19th Century                  0.001       0.007       -0.008      0.009
                              0.004       0.003**     0.004*      0.004**
20th Century                  0.008       0.005       -0.001      0.018
                              0.005       0.007       0.004       0.012
Constant                      -21.630     -67.578     -19.889     -64.677
                              12.254*     15.157***   16.129      11.184***
Observations                  108         84          72          84
R-squared                     0.83        0.78        0.88        0.86
Chi(2)                        2.97        7.10        7.75        7.44
P > Chi(2)                    0.71        0.21        0.17        0.19
50% Quartile Lag (years)      3.8         2.1         4.0         1.1
90% Quartile Lag (years)      5.5         4.6         5.6         3.1


Robust standard errors reported below each coefficient. Within R-squared reported.
(*) significant at 10%; (**) significant at 5%; (***) significant at 1%


90% of the effect is observed after 3 years, compared to the medical sciences 
where it is 5.5 years before 90% of the effect is seen. 
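
The median and 90th percentile lags can be derived from any estimated pattern
of lag weights. The sketch below is one possible way of doing so and is not
necessarily the procedure used for Table 2: the function name `quantile_lag`
is hypothetical, the weights are assumed to be non-negative, and the effect of
each lag is assumed to accrue evenly within its year, which is what allows
fractional lags to be reported.

```python
def quantile_lag(weights, share):
    """Lag (in years) by which a given share of the total effect has accumulated.

    Assumes non-negative weights and 0 < share <= 1. The effect of lag j is
    treated as accruing evenly between year j and year j + 1 (an assumption).
    """
    total = sum(weights)
    cumulative = 0.0
    for lag, weight in enumerate(weights):
        previous = cumulative
        cumulative += weight / total
        if cumulative >= share:
            # Linear interpolation inside the year in which the share is reached.
            return lag + (share - previous) / (cumulative - previous)
    return float(len(weights))


# Hypothetical flat lag pattern over four years, for illustration only.
flat_weights = [1.0, 1.0, 1.0, 1.0]
median_lag = quantile_lag(flat_weights, 0.5)
p90_lag = quantile_lag(flat_weights, 0.9)
```

With weights concentrated in the early lags both statistics fall, and with
weights concentrated towards the end of the lag span both rise.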

We obtained statistically significant and important coefficients for some of 
the control variables. First, the variable capturing teaching load is statistically 
significant and important for all fields except engineering. The coefficient is 
always negative, confirming that large undergraduate teaching loads have a 
disruptive effect on scientific production. The biggest effect is in the medical 
sciences. In this case, an increase of one additional undergraduate student per 
research staff member has the effect of reducing research output by about 5%. 
Second, and rather surprisingly, a higher allocation of funds to London based 
universities results in a slightly less productive system in the social sciences. 
Third, also contrary to expectation, we found some evidence to support the 
view that a bigger allocation of funds to the Group 94 universities would result 
in an overall higher research output in engineering and the medical sciences; 
no significant effect was identified for the Russell Group universities.⁹ Fourth,
a larger share of funds to the Medieval universities was shown to result in a
more productive system in the social sciences, but has a negative impact on
the medical sciences. A larger share of funds to the 19th Century universities
had a positive effect on engineering and the social sciences research output,
and a negative impact on the medical sciences. The comparator group for
university history is the group of universities founded after WWII. Finally,
the validity of the constraints was never rejected.
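
The figure of about 5% quoted above for the teaching load in the medical
sciences follows from reading the coefficient as a semi-elasticity. The
calculation assumes that output enters the model in logs and that the teaching
load enters as a ratio in levels, and it uses the medical sciences coefficient
of -0.052 from Table 2:

```latex
\[
\Delta \ln y = -0.052
\quad \Longrightarrow \quad
\frac{\Delta y}{y} = e^{-0.052} - 1 \approx -0.051 ,
\]
```

that is, a fall in publications of roughly 5% for one additional undergraduate
student per member of staff.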


3.2 Citations 


Citation output was analysed following the procedure used for publications. 
Figure 2 shows the pattern of weights for the different disciplines. The results 
appear similar to those for publications. The citation output tends to respond 
more quickly to an increase in R&D investment in the social sciences than in the
other scientific fields. The medical sciences show their largest research impact
only at the end of the time period, while the polarisation is less strong for the 
natural sciences and engineering. The main difference from the publication 
lag structure results is the very similar pattern for engineering and the natural 
sciences: engineering has a less symmetric profile and behaves much more like 
the natural sciences. 

Table 3 (see p. 84) presents the results for the estimation of model (6) in the
case of citations output. In terms of long run science budget elasticity the 
results are very similar to the results for publications. The highest elasticity is 
found in the medical sciences (0.61), while the lowest is in engineering (0.15), 
which is non-significant. The time trend variable is always positive and sig- 
nificant in three of the fields, once again pointing to an increase in scientific 
opportunities and the existence of international spillovers. Regarding the 
impact distribution of changes in the research budget, the earliest impact is in 
the social sciences where 90% of the effect is observed after about 4 years, 
while in the medical sciences 90% of the effect is achieved only after 6 years. 

Regarding the remaining control variables, the results tend to be consistent 
with those for publications, with the exception of the control variable for the 
Russell Group universities: a larger allocation to Group 94 universities does 
not have a positive effect on the system output, while a higher share of funds 
to Russell Group universities has a positive impact on citations in the medical 
sciences, but a negative impact in engineering. The teaching variable is again 


9 The higher output could be due to two phenomena: higher productivity of the Group
94 universities, or the competition effect from the other universities which received less 
funds. University level micro data would be needed to disentangle these two effects. This 
reasoning applies to the other resource allocation control variables. 

Figure 2 Restricted Pattern of Weights (Citations), by fields

negative, with the highest absolute value in the medical sciences. A bigger
allocation to London based universities does not provide any positive
advantage. Larger proportions of direct funding to Medieval universities
result in a higher citations output in the social sciences and lower returns in
the medical sciences, similar to the 19th Century universities, with the addition


Figure 3 Restricted Pattern of Weights (Graduate Students), by fields 

Table 3 UK Levels Estimates, Citations (Method: field fixed effect)


                              NS          ENG         MS          SS
R&D                           0.212       0.146       0.617       0.353
                              0.123*      0.196       0.160***    0.095***
Year                          0.014       0.037       0.005       0.032
                              0.008*      0.009***    0.011       0.007***
Undergraduate Teaching        -0.031      -0.018      -0.059      -0.016
                              0.010***    0.014       0.012***    0.007**
London                        0.001       -0.011      0.007       -0.008
                              0.003       0.003       0.003       0.005
Group94                       0.001       0.004       0.003       0.001
                              0.004       0.003       0.003       0.005
Russell                       0.004       -0.004      0.004       0.000
                              0.003       0.005**     0.004*      0.005
Medieval                      0.002       0.005       -0.018      0.013
                              0.005       0.005       0.007***    0.005***
19th Century                  0.001       0.006       -0.008      0.009
                              0.004       0.003**     0.004*      0.004**
20th Century                  0.008       0.004       0.002       0.018
                              0.005       0.007       0.006       0.012
Constant                      -22.493     -67.690     -10.691     -63.943
                              12.946*     16.240***   18.718      12.118***
Observations                  108         84          66          84
R-squared                     0.68        0.77        0.84        0.67
Chi(2)                        1.398       2.760       8.635       7.160
P > Chi(2)                    0.966       0.838       0.195       0.306
50% Quartile Lag (years)      4.4         3.3         4.3         1.6
90% Quartile Lag (years)      5.5         5.3         6.0         3.9


Robust standard errors reported below each coefficient. Within R-squared reported.
(*) significant at 10%; (**) significant at 5%; (***) significant at 1%

of a positive impact for engineering. Finally, as before, the constraints implied 
by the model were not rejected. 


3.3 Graduate students 


The third science output we examined at the field level for the UK is the 
‘production’ of graduate students. Figure 3 (see p. 83) shows the lag structure. 
The most interesting result of this analysis is that the patterns appear quite 
different from the patterns for publications and citations. This result might be 
because graduate students are a research output of a completely different 
nature to publications and citations. In the case of graduate students, the 
medical sciences, engineering and the natural sciences show the strongest 
impact quite quickly (in the first three years), while the impact in the social 
sciences does not become evident until towards the end of the time frame. We
do not have a definitive explanation for this result, but we incline to the view
that the combination of a different mix of graduate courses (MSc, MPhil and
PhD) across the different fields might be generating these sorts of differential 
impacts. 

Table 4 (see p. 86) shows the results of estimating model (6) using field fixed 
effects. The largest elasticities regarding this type of research output are found 
in the natural and the medical sciences with values of 0.54 and 0.65 respec- 
tively. The corresponding elasticities for the social sciences (0.21) and engi- 
neering (0.11 and non-significant) are much lower. The time trend had a 
positive coefficient in all fields except the natural sciences. This points to an 
increase in productivity in the social and the medical sciences and engineering 
regarding the ‘production’ of graduate students. In terms of the impact dis- 
tribution of changes in the research budget, the most immediate impact is in 
engineering, where 90% of the effect is observed after about 3 years, while the 
most delayed impact is in the social sciences where it takes 5.3 years for 90% 
of the effect to be felt. 

Regarding the remaining control variables there are some interesting 
results. First, the undergraduate teaching variable is negative in the natural 
sciences, the medical sciences and engineering, pointing to the fact that in 
these fields an increase in the undergraduate teaching load negatively affects 
the time allocated to supervising and guiding graduate students. In contrast, in 
the social sciences we have a positive impact from undergraduate teaching 
towards graduate teaching, pointing to an apparently different nature of 
graduate studies in this field, a possibility that requires much more analysis for 
it to be confirmed. Second, as before, there was no evidence of a positive 
localisation effect for a higher allocation of grants and contracts to London 
based universities. Third, a bigger allocation of funds to the universities 
belonging to the Group 94 had a positive premium in engineering, while for 
those in the Russell Group the biggest premium was in the social sciences. 
Fourth, in terms of age, a higher share of funds to the Medieval universities 
has a positive effect in the social and the medical sciences, while more funding 
to the 19th Century universities induces an increase in the university system
output in the medical sciences, but a decrease in the natural sciences. This last
result also applies to the 20th Century universities, which also showed a
positive premium in the social sciences. The comparator group, as in the previous
two estimations, was the category of the Post WWII universities. Finally, as 
before, in all the models the constraints were not rejected. 

Field level estimates provide us with an interesting set of results. Most of 
these are novel to the literature on the economics of science and, thus, should 
be seen as preliminary and exploratory, to be confirmed by further analyses. 
First, in the case of the medical sciences, the social sciences, and the natural
sciences we can identify positive and significant returns for publications, 

Table 4 UK Levels Estimates, Graduate Students (Method: field fixed effect)


                              NS          ENG         MS          SS
R&D                           0.542       0.107       0.656       0.214
                              0.200***    0.173       0.158***    0.091**
Year                          -0.024      0.027       0.044       0.064
                              0.011**     0.009***    0.011***    0.007***
Undergraduate Teaching        -0.062      -0.044      -0.024      0.015
                              0.015***    0.020**     0.010**     0.006***
London                        0.002       0.009       -0.01       -0.005
                              0.004       0.004       0.004       0.003
Group94                       0.003       0.003       0.005       -0.005
                              0.006       0.005*      0.006       0.002
Russell                       0.005       -0.003      -0.009      0.003
                              0.004       0.005       0.006       0.002*
Medieval                      -0.015      0.004       0.028       0.006
                              0.009       0.005       0.014*      0.004*
19th Century                  -0.014      -0.005      0.017       0.003
                              0.007**     0.004       0.008**     0.004
20th Century                  -0.014      0.008       0.002       0.015
                              0.008*      0.007       0.003       0.007**
Constant                      47.113      -48.338     -93.097     -124.08
                              18.987**    16.708***   19.154***   13.174***
Observations                  99          77          66          77
R-squared                     0.70        0.65        0.87        0.92
Chi(2)                        3.005       0.635       10.952      9.803
P > Chi(2)                    0.699       0.986       0.052       0.081
50% Quartile Lag (years)      1.3         0.8         1.1         3.8
90% Quartile Lag (years)      3.5         2.8         3.1         5.3


Robust standard errors reported below each coefficient. Within R-squared reported.
(*) significant at 10%; (**) significant at 5%; (***) significant at 1%


citations, and graduate students from investment in higher education R&D. 
Although positive, the effect for engineering is only significant in the case of 
publications, pointing to the fact that the research output from this scientific 
field is better captured by measures other than citations and research students. 
Second, the four scientific fields tend to have different lag structures. This is 
particularly noticeable in the case of the social sciences. While investment in 
R&D in the social sciences affects publications and citations more
immediately than in the other three fields, in the case of graduate students most of the
returns to research grant and contract funding are concentrated at the end of 
the period. Third, we found strong evidence that a high undergraduate 
teaching load negatively affects the research outputs of UK universities. Only 
in the case of graduate students in the social sciences did we find a positive 
effect. Fourth, we constructed a set of control variables to assess the importance 
of allocation of grants and contracts to different subgroups of universities 
(university specific effects). Some of these were significant and 
important, pointing to the fact that different allocations of funds to uni- 
versities result in higher or lower university system scientific production. The 
higher or lower output may be due to higher productivity in the institutions 
that received more grants and contracts or a competition effect from the 
universities that received less funds. Micro data at the level of the university 
would be needed to identify which of the two effects is dominant. 


4 The UK knowledge productivity analysis 


This section focuses on the efficiency with which the domestic stock of 
knowledge (science budget and other grants and contracts) is applied in order 
to generate the different research outputs. Has this efficiency grown over time 
or has it declined across disciplines? Building on the results in section 3 we 
computed field specific total factor productivities (TFPs). These TFPs capture 
the evolution of the scientific opportunities in each field, and also the effects 
of changes in organisational practices, resources allocation, and management. 

For each macro field we computed the residual of the knowledge pro- 
duction function (6) as: 


(6)   tfp_{it}^{j} = y_{it}^{j} - \beta^{j} W(r)_{it},   i = 1, ..., N;  j = 1, ..., J 


where tfp_{it}^{j} is the knowledge production function (semi) residual after 
controlling for changes in W(r)_{it}, the distributed lagged function of real past 
R&D expenditures. In order to compute (6) we first need an estimation of the 
elasticity coefficients by field (the βs). We use the field level results shown in 
Tables 2 (see p. 81) to 4. Given the lags used in the construction of W(r), we 
can only focus on productivity evolution during the 1990s. 
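
The residual calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: output and the lag-weighted R&D variable are taken in logarithms (so that the coefficient reads as an elasticity), and all numbers, including the value of beta, are hypothetical rather than taken from the estimates reported in the tables.

```python
import math

def tfp_residual(log_output, beta, log_rd_stock):
    """Semi-residual of the knowledge production function: y - beta * W(r)."""
    return log_output - beta * log_rd_stock

# Hypothetical series for one field over three years (not from the paper).
outputs = [1000.0, 1060.0, 1110.0]   # e.g. publication counts
rd_stock = [200.0, 206.0, 212.0]     # lag-weighted real R&D spending, W(r)
beta = 0.5                           # assumed elasticity for this field

tfp = [tfp_residual(math.log(y), beta, math.log(w))
       for y, w in zip(outputs, rd_stock)]

# TFP index with the first year set to 100, and year-on-year growth in per cent
# (log differences multiplied by 100).
tfp_index = [100.0 * math.exp(v - tfp[0]) for v in tfp]
tfp_growth = [100.0 * (b - a) for a, b in zip(tfp, tfp[1:])]
```

The index rises whenever output grows faster than beta times the growth of the R&D stock, which is the sense in which the residual picks up changes not explained by spending.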

Figures 4 (see p. 86), 5 and 6 (see pp. 90/91) show the evolution of the TFP 
index by field over time for each of the research outputs. Two clear patterns 
emerge. In all macro fields and research outputs there is an upward trend in 
the productivity indices, suggesting that there is a clear improvement in the 
efficiency and technological opportunities of the system. In all four major 
scientific fields and for the three traditional outputs of scientific research, the 
productivity of UK science increased throughout the 1990s. However, from the 
mid 1990s, in all the macro fields, there has been a marked slowdown in 
productivity growth rates as highlighted by the less steep slopes of the pro- 
ductivity indices at the right of the figures. 

Across the whole period the TFP growth rate in the case of publications has 
fluctuated between 1.2% and 2.4%, with the lowest value in the natural sciences 
and the highest in the social sciences. Taking the cut-off point of 1996 
(chosen to coincide with the 1996 RAE), the average TFP growth rate during 
the first half of the 1990s was compared to the same indicator for the period 
from 1996 to 2001. The data show a remarkable slowdown in productivity: TFP 
growth rates declined by more than 50% in the natural sciences, engineering 
(the largest decrease) and the social sciences, but by ‘only’ 22% in the medical 
sciences. Numbers of citations show a similar profile. The highest growth rate 
is in engineering (2.4%) and the lowest in the medical sciences (0.8%). There 
is also a clear slowdown in productivity growth rates, but the degree of the 
decline is even greater than in publications. Finally, the results for graduate 
students are similar to those for citations with the exception that the highest 
growth rate occurs in the medical sciences. The slowdown in the second half of 
the 1990s is also remarkable: in engineering and the natural sciences TFP 
growth rates halved, while in the social and the medical sciences TFP growth 
rates are 60% of their value in the previous period. 


Figure 4 TFP Publications 

It is important to note that the productivity slowdown is not an artefact of 
the increased spending in UK science. The real increase in science and 
engineering R&D spending in the UK started in 2000-01. In our model the 
impact on research outputs of an increase of about 7% in 2000-01 is spread 
across the succeeding six to seven years; the weight for the first year is small in 
the case of publications and citations (lower than 10% for all except the social 
sciences) and about 25% in the case of graduate students (again excepting the 
social sciences for which it is near zero). A significant increase in R&D 
spending in a particular year can negatively affect the overall productivity of 
the system in that year if a simple productivity measure based on the ratio 
between that year’s inputs and outputs is considered. Our measure of 
productivity refers to changes in research output that are not explained by 
changes in the stock of scientific knowledge as proxied by current and past 
R&D spending. Our estimation of the stock of knowledge already controls for the 
fact that there are some adjustment lags and that a given increase, for 
example, in the science budget, is not going to have an immediate impact on 
research outputs. In the case of a traditional productivity measure, such as the 
ratio between papers and HERD, the UK has witnessed a very clear decline in 
the 1990s due to the significant increase in the science budget and not to a 
deterioration in the performance of the system (Evidence 2003). Our measure 
of productivity controls to some extent for this and tries to capture 
organisational or managerial changes in the system. 


Figure 5 TFP Citations 

Figure 6 TFP Graduate Students 


Table 5 TFP growth rate decompositions for the natural sciences and 
engineering 


Time      Natural sciences                    Engineering 
          A     B     C     D     TOTAL       A     B     C     D     TOTAL 

91-96     1.5   1.4   1.4   1.2   1.1         3.0   2.8   3.1   2.8   2.7 
96-01     0.7   0.8   0.7   0.6   0.6         1.3   1.2   1.4   1.2   1.3 
Total     1.1   1.1   1.1   1.0   0.9         2.2   2.1   2.3   2.1   2.1 


Note: A controls only for R&D spending; B is A plus controlling for resources allocation 
by London, Group 94 and Russell Group; C is A plus controlling for University Age; D is 
A plus controlling for teaching intensity; and Total is A plus (B+C+D). 

The TFP estimations above take account only of the spending on research 
grants and contracts. We now introduce the other control variables to see 
whether they explain the productivity slowdown. There are several different, 
and overlapping, explanations. One is that in the period 1996-2001 the dis- 
tribution of higher education funding led to the system being less productive 
within each scientific field (B and C estimations). Another is that increased 
enrolment rates at undergraduate level were not compensated for by an 
equivalent increase in staff, leading to a reduction in available research time 
(D estimation). To investigate these two possibilities we re-estimated the TFP 
indices controlling for how resources are allocated across types of institutions 
and for the teaching intensity ratio. The results for publications are presented in 
Tables 5 and 6. 

Two trends emerge from Tables 5 and 6. First, at field level the process of 
resource allocation has no serious impact on productivity growth because 
controlling (or not) for how resources are distributed across university types 
and geographical location (columns B and C compared to column A) only 
marginally affects average productivity growth. The exception is the social 
sciences, where the change in the distribution of higher education funding 
between the first and the second period led to the system being less 
productive; controlling for it reduces the unexplained productivity slowdown 
(for example, in column A the difference between the two time periods is 1.6; 
in column C the difference is 1.2). 


Table 6 TFP growth rate decompositions for the medical and social sciences 


Time      Medical sciences                    Social sciences 
          A     B     C     D     TOTAL       A     B     C     D     TOTAL 

91-96     2.0   2.0   1.9   1.3   1.2         3.2   3.2   2.7   3.2   2.8 
96-01     1.4   1.3   1.3   1.4   1.2         1.6   1.8   1.5   1.7   1.8 
Total     1.7   1.7   1.6   1.3   1.2         2.4   2.6   2.1   2.5   2.3 


Note: A controls only for R&D spending; B is A plus controlling for resources allocation 
by London, Group 94 and Russell Group; C is A plus controlling for University Age; D is 
A plus controlling for teaching intensity; and Total is A plus (B + C + D). 
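

The comparison of columns quoted in the text for the social sciences can be reproduced with a short calculation. The sketch below is illustrative only: the function and variable names are ours, and the first-period value for column C (2.7) is the one implied by the difference of 1.2 stated in the text.

```python
# Average annual TFP growth rates (per cent) for publications in the social
# sciences. Column A controls only for R&D spending; column C additionally
# controls for university age. The 91-96 value for column C is implied by the
# difference of 1.2 quoted in the text.
growth = {
    "A": {"91-96": 3.2, "96-01": 1.6},
    "C": {"91-96": 2.7, "96-01": 1.5},
}

def slowdown(rates):
    """Drop in the growth rate between the two sub-periods, in percentage points."""
    return round(rates["91-96"] - rates["96-01"], 1)

slowdowns = {column: slowdown(rates) for column, rates in growth.items()}

# Share of the column A slowdown that is still unexplained under column C.
remaining_share = slowdowns["C"] / slowdowns["A"]
```

On these figures roughly three quarters of the slowdown remains after the additional control, which is why the text speaks of a reduction in, rather than an explanation of, the unexplained slowdown.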


The results controlling for teaching intensity are similar and, again, are 
relatively invariant. The exception is the medical sciences where, after 
controlling for teaching intensity, productivity growth rates fall from 1.7% to 
1.2% (row total) and the two sub-periods show no productivity slowdown. 
Interestingly, after controlling for teaching intensity, TFP growth in the first 
period drops to 1.3, pointing to the fact that the reduction in teaching intensity 
in this discipline actually contributed to the higher productivity during the first 
time period.¹⁰ For the other research outputs the conclusions are similar to those 
for publications. 

Controlling for resource allocation and teaching intensity partially explains 
the productivity slowdown in the medical sciences (especially in the case of 
publications), but does not account for the productivity slowdown in the 
second half of the 1990s for the other scientific fields. 

There are four possible reasons for this unexplained slowdown. First, there 
might have been a deterioration in the organisational efficiency of production 
of traditional science outputs within each field (and even within departments) 
due, for example, to the creation of incentives for the development of third 
stream type activities. Second, there might have been a reduction in human 
capital (the quality of labour), i.e., in the research staff. Underlying this 
hypothesis is the possibility that the lag in the relative compensations paid to 
researchers in the universities could have led to some high skilled staff leaving 
academia (for positions overseas or for jobs in industry), being replaced by an 
equivalent number of lower quality personnel. Third, due to the increase in 
other countries’ publishing in English, UK researchers are facing increased 
competition for publication in ISI journals, raising the bar to getting published 
(a quality effect).¹¹ 


10 Student to staff ratios in the medical sciences decreased from 8.4 to 7.4 students per 
staff across the whole period. A more detailed inspection shows that any decrease was 
mostly during the first sub-period. The ratio of students to staff in 1991-95 declined 
annually from 8.4 in 1991 to 6.9 in 1995. This variable was more volatile in the second 
sub-period oscillating between 6.9 and 7.3. 

All of these are pessimistic explanations for the productivity growth slowdown. 
A fourth, more optimistic possibility is that the RAE has had an impact. 
We can think of the RAE as a sort of institutional shock 
in the research incentive system for academic units. That is, the introduction of 
the RAE at the end of the 1980s/beginning of the 1990s produced a positive 
shock, which induced a productivity increase on the part of UK scientists. If 
this shock were affecting productivity levels rather than growth rates, after a 
transition period the system would return to its average growth rate. In other 
words, the effect of the RAE may have been more dramatic in the early 1990s, 
but subsequently declined. This could explain the productivity slowdown in 
the second sub-period considered in our analysis. 
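
The distinction between a shock to productivity levels and a shock to growth rates can be made concrete with a small simulation. This is a sketch with purely hypothetical parameters (a 1% trend and a level gain of roughly 10% phased in over five years); it is not calibrated to the UK data.

```python
def simulate_levels(years, trend_growth, shock_size, shock_years):
    """Productivity levels when a one-off level shift is phased in over some years.

    trend_growth: underlying annual growth rate (e.g. 0.01 for 1%).
    shock_size:   approximate total proportional level gain from the shock.
    shock_years:  number of initial years over which the gain is phased in.
    """
    level = 100.0
    levels = [level]
    for year in range(years):
        growth = trend_growth
        if year < shock_years:
            # Spread the level gain evenly across the transition years.
            growth += shock_size / shock_years
        level *= 1.0 + growth
        levels.append(level)
    return levels

def average_growth(levels, start, end):
    """Average annual growth rate (per cent) between two positions in the series."""
    return 100.0 * ((levels[end] / levels[start]) ** (1.0 / (end - start)) - 1.0)

# Hypothetical parameters, chosen only for illustration.
levels = simulate_levels(years=10, trend_growth=0.01, shock_size=0.10, shock_years=5)
first_half = average_growth(levels, 0, 5)    # transition period
second_half = average_growth(levels, 5, 10)  # after the shock has been absorbed
```

In this stylised case measured growth is higher during the transition and falls back to the trend rate afterwards, although nothing has deteriorated: the level of productivity stays permanently higher.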

It is very difficult to identify which of these potential explanations is the 
most relevant. Alternative models based on micro data at the university and 
unit of assessment levels could help to clarify the current dynamics of the UK 
science system. 


5 Conclusions 


This paper has analysed the determinants of the three most common uni- 
versity research outputs: publications (as a proxy for the production of codi- 
fied research knowledge); citations (as an impact adjusted proxy for codified 
research production); and Masters and PhDs awarded (as a proxy for the 
production of tacit knowledge accumulated in human capital) for the UK 
case. 

The analysis of the UK science system as represented by the old universities 
(which account for about 90% of R&D expenditure) points to the existence of 
different science production functions. We rejected the model of a global 
science production function for the UK in favour of four very broadly defined 
macro-fields: the medical sciences, the social sciences, the natural sciences and 
engineering. In each of these fields either the weight patterns or the R&D 
elasticities (and also some of the coefficients of control variables) were sig- 
nificantly different. 

For publications and citations we estimated significantly different lag 
structures, with a long lag for the medical sciences before full returns from an 
increase in R&D spending were achieved, whereas the social sciences saw 
results in the first few years. This means that the science system does not 
respond uniformly to changes in funds. For example, an increase in the overall 
science budget will have a rather sequential impact: first, the changes will be 
felt mainly in the social sciences, then in engineering and the natural sciences 
and finally, in the medical sciences. For graduate student research output the 
results are different, with the short term impact being concentrated in 
engineering and the long term impact in the social sciences. 


11 There is some evidence of this phenomenon in the discussion in the New York Times 
(May 3, 2004) about the loss of dominance of the US in the sciences to non-English 
speaking countries. 

In the case of the medical sciences, the social sciences and the natural sci- 
ences we identified positive and significant returns for publications, citations 
and graduate students from investment in higher education R&D. Although 
positive, the effect in engineering is only significant in the case of publications, 
pointing to the fact that the research output of this scientific field is better 
captured by measures other than citations and research students. 

We included in the models a set of control variables. We found strong 
evidence that a large undergraduate teaching load negatively affects the 
research outputs in UK universities. Only in the case of graduate students in 
the social sciences did we see a positive effect. Overall, the higher the teaching 
load, the lower the research productivity. This result contradicts the 
policy model followed in the 1980s and 1990s, which assumed that the number 
of students per lecturer could be increased without a decrease in the overall 
quality of the HE system. 

We also controlled for the impact of different allocations of funding across 
types of institutions; the results are mixed and vary according to the different 
research outputs. Some were significant, pointing to the fact that different 
allocations of funds to universities result in higher or lower university system 
scientific production. Due to the limitations of field level data, the results on 
university specific factors, though interesting, should be considered as pre- 
liminary: they require validation through analyses based on micro data. 

Finally, we developed an analysis of the productivity of UK science and the 
changes in it during the 1990s. UK TFP has grown across the whole period. 
This result contrasts with the most standard publication per HERD measure 
of productivity, which presents a remarkable drop in British productivity, 
mainly due to a combination of increased budget and publication lags. 
However, we also identified a clear slowdown in TFP growth in the second 
half of the 1990s compared to the first. This decline is not due to an increase in 
the research spending in the later period, nor to the way that resources were 
allocated across institutions (although this did have some effect in the medical 
sciences and the social sciences), nor to an increase in teaching loads (which 
were fairly static in the second half of the 1990s). We speculate that this 
slowdown in productivity is due to mainly unobserved systemic effects (a 
policy shock during the first half of the 1990s such as the RAE) or very micro 
factors related to the (relative) reduction in researchers’ rewards, the intro- 
duction of more transferable research or a ‘brain drain’ of high skilled 
researchers. This slowdown can also be ascribed to increased competition for 
publication in ISI journals from overseas research. Without more micro data it 
is not possible to tease out the relative importance of these alternative 
explanations. These results are consistent with the results of international 
analysis, which point to a decrease in the relative productivity of UK science. 
Indeed, it is possible to envisage that, during the 1990s, UK science showed 
positive productivity growth, but that this growth was less marked than in 
other countries, especially in the second half of the 1990s. 

This paper aimed to test the feasibility of using econometric models to 
produce results that could contribute to the development of science policy. The 
aim was not to produce exact indicators of the dynamics of the science 
system on the basis of which to draw strong policy conclusions. Rather, the 
inherent shortcomings in the measurement of the output (and ultimately of 
the outcomes) of the scientific activity, and the limitations on the available 
input data call for extreme caution in the interpretations of our results. The 
conclusions presented above should be taken as a first and preliminary 
attempt to develop a better understanding of the relationship between the 
allocation of resources and scientific research output. 


Comment by 


CHRISTIAN PIERDZIOCH 


In an integrating world economy, scientific research leading to the invention 
of new products and the development of unconventional ideas forms the basis 
of the international competitiveness of firms, industries, and entire economies. 
It is for this reason that it has often been emphasized in academic research 
and in policy circles that developing a deeper understanding of the factors that 
affect the quantity and quality of research output is of key importance for the 
prosperity and international competitiveness of economies. One factor that 
may affect research output is the availability of financial funds. But how 
exactly does the availability of financial funds affect research output? How 
should research output be measured? Are the effects of the availability of financial funds 
on research output different across disciplines? Does the availability of 
financial funds affect research output immediately or only with a time lag? 
These are all important questions, and finding answers to these questions is a 
difficult task. Gustavo Crespi and Aldo Geuna have used a novel database on 
the productivity of scientific research at UK universities to empirically tackle 
this task. Their empirical analysis is highly welcome because it yields inter- 
esting insights into the factors that affect research output, and because it has 
been competently and thoroughly done. 

In order to conduct their empirical analysis, Crespi and Geuna have col- 
lected data on the productivity of scientific research at 52 UK universities. 
Their data cover the period 1984-2002. They present results for four major 
fields of science: engineering, the natural sciences, the medical sciences, and 
the social sciences. They have used the number of publications, the number of 
citations, and the number of graduate students to measure research output. In 
their empirical analysis, they have used a production function to link their 
measures of research output to expenditure on research and development 
(R&D). In order to capture potential time lags between expenditure on R&D 
and research output, Crespi and Geuna have estimated a polynomial dis- 
tributed lag model. They have estimated this model using techniques available 
for estimating panel data models. Their model contains a number of control 
variables, including a measure of the undergraduate teaching load and 
measures of the localization and reputation of a university. 
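
The idea behind a polynomial distributed lag can be sketched as follows. The lag weights are restricted to lie on a low-order polynomial in the lag index, and the weighted sum of past spending then enters the production function as a single regressor. The quadratic shape, the lag length and the spending series below are hypothetical and serve only to illustrate the construction; they are not the specification estimated by Crespi and Geuna.

```python
def almon_weights(coeffs, max_lag):
    """Lag weights w_k = sum_p coeffs[p] * k**p for k = 0..max_lag, scaled to sum to 1."""
    raw = [sum(c * k ** p for p, c in enumerate(coeffs)) for k in range(max_lag + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def lagged_stock(spending, weights):
    """W_t = sum_k weights[k] * spending[t - k], for every t with a full lag history."""
    max_lag = len(weights) - 1
    return [
        sum(w * spending[t - k] for k, w in enumerate(weights))
        for t in range(max_lag, len(spending))
    ]

# Hypothetical quadratic lag shape over lags 0..6, peaking at a lag of three years.
weights = almon_weights(coeffs=[1.0, 3.0, -0.5], max_lag=6)

# Hypothetical real R&D spending series (nine years).
spending = [100.0, 102.0, 104.0, 107.0, 110.0, 112.0, 115.0, 119.0, 124.0]
stock = lagged_stock(spending, weights)
```

Because six lags are needed before the first value of the stock can be computed, the usable sample starts later than the spending data, which is the reason a long lag structure shortens the period over which productivity can be analysed.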

Crespi and Geuna present a number of interesting arguments and results, 
and every single argument and result deserves to be discussed in detail. In the 
following, I shall focus on potential problems that may arise in the meas- 
urement of research output, the specification of the production function, and 
the interpretation of the empirical results. 

As concerns the measurement of research output, one problem is that 
counting the number of publications may be a good indicator of the 
quantity of research output, but not necessarily of the quality of research 
output. For example, not all academic journals are highly ranked “top” 
journals, and getting a paper published in an international “top” journal is 
much more difficult than getting it published in a national journal or a highly 
specialized field journal. For this reason, the scientific community has 
developed sophisticated ranking schemes in order to capture the importance 
and the impact of academic journals. In consequence, it would be interesting 
to analyze how the results Gustavo Crespi and Aldo Geuna report would 
change if rankings of journals were used to weight publications. Rankings 
of journals may also be useful as a weighting scheme for citations because it 
could make a difference whether a research paper is mainly cited, for 
example, in a specialized field journal or a general interest journal. 
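
A journal-weighted publication count of the kind suggested here could be computed along the following lines. The journal names and weights are invented for illustration; any actual ranking scheme would have to supply its own weights.

```python
from collections import Counter

def weighted_publications(papers, journal_weights, default_weight=0.0):
    """Sum of journal weights over a list of papers (each paper given by its journal)."""
    counts = Counter(papers)
    return sum(journal_weights.get(journal, default_weight) * n
               for journal, n in counts.items())

# Hypothetical journals and weights; a real ranking scheme would replace these.
journal_weights = {
    "General Interest A": 1.0,
    "Field Journal B": 0.5,
    "National Journal C": 0.2,
}
papers = ["General Interest A", "Field Journal B", "Field Journal B", "National Journal C"]

raw_count = len(papers)
weighted_count = weighted_publications(papers, journal_weights)
```

The same function could be applied to citations by listing, for each citation, the journal in which the citing paper appeared.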

As concerns the specification of the production function, it would be 
interesting to learn more about potential problems caused by omitted varia- 
bles and the potential influence of control variables different from those used 
by Crespi and Geuna. As concerns potential problems caused by omitted 
variables, it could be the case that the positive link between research output 
and expenditure on R&D reported in the paper is at least in part due to the 
influence of a third variable not yet included in the empirical model. One such 
variable could be a measure of the stance of the business cycle. For example, 
in a business cycle boom, tax revenues and, because of a stock market boom, 
the budgets of private foundations increase. This may lead to an increase in 
expenditure on R&D. At the same time, expenditure on R&D by firms is 
likely to increase, firms may hire researchers, and the salaries paid by firms 
may also increase. This could strengthen the competition between firms and 
universities for researchers, resulting in an increase in research output. 

As concerns control variables, it would be interesting to include, for 
example, the number of researchers per field, the number of research semi- 
nars and scientific conferences that took place at a university, and the number 
of visiting researchers as control variables in the production function. These 
variables may give a good account of the reputation of a university. Moreover, 
these variables may proxy the quality of the research environment at a uni- 
versity. Of course, collecting data on these variables could turn out to be very 
difficult. Given that path dependencies may play an important role for the 
reputation of a university and the quality of the research environment, it could 
be interesting to use lagged research output as a control variable in the pro- 
duction function. 

As regards the specification of the production function, it would also be 
interesting to learn more about the interpretation and the statistical properties 
of the explanatory variables. For example, the authors have included a time 
trend in the vector of explanatory variables, and they argue that the time trend 
captures spillovers from abroad. It would be interesting to learn more about 
these spillovers from abroad. Do they represent the international exchange of 
ideas? Or do they represent the importance of forming international research 
teams? One could also ask whether the time trend reflects spillovers from 
abroad or rather captures structural breaks or stochastic trends in the data. 

As concerns the interpretation of their empirical results, the authors mainly 
focus on the slowdown in productivity of research at UK universities that took 
place at the end of the 1990s. This certainly is an important result that 
deserves a comprehensive and thoughtful analysis. The authors, however, 
report many more interesting results, and it would be interesting to learn more 
about how these results can be interpreted. For example, Crespi and Geuna 
report that, regarding the number of publications, the effects of expenditure 
on R&D are very different across disciplines. Natural questions that arise as 
regards this result are: Why are there differences across disciplines? Why is it 
important to learn more about differences across disciplines? Are there any 
policy implications? Should we spend more or less money on R&D at uni- 
versities? Is the allocation of expenditure on R&D across disciplines optimal? 
It is impossible to answer all these questions in one paper. However, I suggest 
that the paper could benefit from considering one or the other of these 
questions. 

To sum up, Crespi and Geuna have done very interesting research, and they 
have undertaken their empirical research with care. Their paper is concisely 
written, and I have learnt a lot from reading it. I believe that their paper will 
stimulate future research. 


Evaluation of Researchers: 
A Life Cycle Analysis of German Academic Economists 


by 
MICHAEL RAUBER and HEINRICH W. URSPRUNG* 


1 Introduction 


Evaluations compare certain features of a person with the features observed 
in a group of peers. A worthwhile evaluation needs to explicitly define the 
relevant comparison group and to make a case for the employed choice. In 
many cases, the contemporaries of the person to be evaluated represent the 
relevant peer group, the best example being the standard IQ test whose name 
even refers to the fact that intelligence is measured in relation to some 
denominator, which is, of course, the respective person’s age. In sports, where 
evaluation almost represents the raison d’être, it is also quite common to 
compare contestants of the same age group, but other comparison groups, 
based, for example, on body weight or professional status, are also widely 
employed. 

Research evaluations that are based on scientometric methods are still 
surrounded by a touch of controversy. Nevertheless, it is generally accepted 
that reasonable scientometric evaluations need to focus on narrowly defined 
disciplines; how the disciplines should be delineated is, of course, another 
matter. Many scientometric studies are, moreover, restricted to specific geo- 
graphic regions and types of institutions. Apart from these public-domain 
characteristics, the relevant peer group is also described by personal charac- 
teristics, arguably the most important one being the researcher’s age. 

Age features two distinct dimensions that are relevant in the evaluation 
context: vintage and career age. Both of these dimensions are liable to have a 
strong impact on research productivity because research production heavily 
relies on human capital that is determined, on the one hand, by the initial 
endowment (i.e., by ability and initial training) and, on the other hand, by 
experience and obsolescence of knowledge. Since initial training (graduate 
education) is related to the age cohort, whereas experience and obsolescence 
of knowledge are related to career age, both of these age dimensions represent 
personal characteristics that are associated with generally recognized peer 
groups (class of 2005, assistant professors in their sixth year, etc). 

* We thank Robert Hofmeister and Philipp Stützle for valuable research assistance. 


Precisely because life-cycle and vintage effects are liable to influence any 
researcher’s productivity, research evaluations which are undertaken to 
implement incentive-compatible managerial reward or penalty schemes need 
to take these age dimensions into account. In principle, this statement is not 
controversial. Tenure and promotion committees have always compared the 
track records of the applicants with precedents. Alternatively, they have 
judged whether the track records are compatible with an established policy or 
standard. These standards, however, have evolved over time by investigating 
research oeuvres of applicants who, by the very fact that they aspired to take a 
certain career step, constitute a peer group defined by career age. Decisions 
with respect to performance-related pay have likewise been based on com- 
parisons of track records. Since remuneration, unlike tenure and rank, does 
not represent a time-invariant prize, the applicant’s age at the time of the 
application, i.e., his or her cohort or vintage, is always implicitly taken into 
account by the responsible authorities. 

Even though the age dimensions are of great importance for management 
decisions, studies dealing with the evaluation of economic research have hitherto 
rather neglected them. This neglect applies especially to studies that evaluate entire 
groups of researchers, for example university departments or research insti- 
tutes. An exception is the ranking study by Combes and Linnemer (2003). 
These authors, who rank 600 economic research institutions from 14 European 
countries, present, among others, one research productivity index that takes 
the respective researcher’s career age into account. Even though the 
employed method of normalization with respect to career age is purely ad hoc, 
and the career age of the economists is estimated by rule of thumb, this study 
is groundbreaking because it spells out the demands that high-quality rankings 
should meet. 

The available literature on life cycles in research productivity is oddly 
disconnected from the evaluation issue. The studies investigating life cycles 
are usually motivated by Gary Becker’s human capital theory that predicts 
that investment in human capital decreases over the life cycle, thereby gen- 
erating hump-shaped individual life cycles in labor productivity and earnings. 
Some scholars have extended the human capital approach to analyze the 
processes which are specific to research production. Others have used the 
standard human capital approach in order to guide their attempts to empiri- 
cally identify the determinants of labor productivity; these scholars focus on 
research production mainly because measuring research productivity is, in 
many respects, easier than measuring labor productivity in other fields. The 
AER paper by Levin and Stephan (1991) followed both of these routes and 
was instrumental in kicking off the field that is now known as the economics of 
science. 

Surprisingly few studies on life cycles in research productivity were written 
by economists or investigate the economics profession. This has already been 
deplored by Paula Stephan in her (1996) JEL survey. Recent work on the 
economics profession includes Kenny and Studley (1996), Oster and Hamermesh 
(1998) and Baser and Pema (2004), whose empirical results are compatible 
with a hump-shaped progression of individual research productivity 
over the life cycle as hypothesized by Becker’s human capital theory. Goodwin 
and Sauer (1995), on the other hand, who do not clamp the life cycle in 
the Procrustes bed of a quadratic specification, identify a bi-modal life cycle. 
Hutchinson and Zivney (1995) and Hartley et al. (2001) do not find any 
evidence supportive of the standard life cycle hypothesis at all. 

Among the many considerable econometric problems that arise when 
estimating life-cycles in research productivity, the most challenging one 
arguably consists of separating career age and cohort effects, an endeavor that 
is confounded by the fact that publication behavior has changed over time. In 
order to estimate life cycle and cohort effects separately, an extensive panel 
data set comprising many cohorts is indispensable, otherwise the potentially 
considerable cohort-specific influences cannot be estimated, and the resulting 
estimates of the life cycle pattern will be biased.¹ It is conceivable that,
because of these econometric problems, the empirical evidence with respect to 
cohort effects is somewhat elusive. Baser and Pema (2004) do not find any
cohort effects at all, and Goodwin and Sauer (1995) report only marginally 
significant effects that are tainted since they may well reflect the fact that the 
members of the analyzed cohorts differ in age, implying that the older cohorts 
are composed of academic survivors and thus liable to have been more pro- 
ductive on the average. 

The identification problem becomes even more challenging if one 
acknowledges that the publication behavior of economists has changed over 
time. Even if these changes have been relatively small, they may become 
significant in the course of a time period that allows estimating cohort effects. 
Since, however, career time, historical time and cohort affiliation depend on 
each other in a linear manner (career time = historical time - cohort “birth”
year), only two out of the three effects can be estimated subject to some 
assumption about the development of the third one. This is the reason why all 
estimates of life cycle and cohort effects need to be interpreted with some 
caution.²
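The linear dependence described above can be made concrete with a small numerical check. The following is a minimal sketch, not part of the original study: the observations are hypothetical, and `matrix_rank` is a helper written only for this illustration. It shows that a design matrix containing a constant, career time, historical time, and cohort year has one column too many, whereas dropping one of the three time variables restores full column rank.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix, computed by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Hypothetical observations: (historical year, cohort "birth" year).
observations = [(1980, 1969), (1990, 1969), (1990, 1981), (2004, 1981), (2004, 1998)]

# Columns: constant, career time, historical time, cohort year.
design = [[1, year - cohort, year, cohort] for year, cohort in observations]
# Same design without historical time.
reduced = [[1, year - cohort, cohort] for year, cohort in observations]

rank_full = matrix_rank(design)      # four columns, but one is redundant
rank_reduced = matrix_rank(reduced)  # three columns, all independent
```

Because the full design has four columns but only rank three, the three effects cannot all be estimated without an additional assumption on one of them.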

This paper unfolds as follows. In the next section we present a new data set 
that describes the research behavior of German academic economists, and in 
section 3 we describe the heterogeneity of research production with respect to 
both age dimensions (career age and cohort affiliation). Our investigation of 


1 Cohort-specific influences are, for example, the knowledge base transmitted during
graduate education, the rate of obsolescence, access to resources, opportunities provided 
by the socio-economic environment, and modes of behavior imprinted on the fledgling 
scientists. See Stephan (1996, 1216-7). 

2 For a detailed exposition of the econometric methods that have been proposed to
identify age, cohort, and period effects on individual research productivity, see Hall et al. 
(2005). 
heterogeneity culminates in the presentation of a simple formula that translates
any German economist’s research oeuvre into a ranking vis-à-vis his or
her peers. Section 4 describes the results of some life cycle regressions. Since
tenure is arguably the most important special feature of the academic
labor market, we analyze, in section 5, the persistence of individual research
productivity in order to assess at what career stage promotion to a tenured
position is justifiable. In section 6, we turn to evaluations of whole research
units (German economics departments) and present some rankings that take
the age dimension into account.


2 The data set 


Most studies of research productivity over the life cycle employ a sample of 
scientists who are relatively active in research. The rationale for this approach 
is twofold. On the one hand, the behavior of choice researchers is better 
documented than that of less active ones. On the other hand, the standard 
econometric methods are better suited to process steady streams of activities 
than time series with many periods of inactivity. Since it is our intention to 
develop an evaluation scheme for all kinds of scientists, we did not follow this 
restricted approach and compiled a dataset that comprises, in principle, all 
academic economists currently working in Germany. 

Since we use the EconLit data base we had to restrict ourselves to econo- 
mists who received their doctoral degree at the earliest in 1969, the first year 
covered by EconLit. Considering that German academic economists receive 
their doctoral degrees when they are about 30 years old, this implies that the 
oldest economists in our data set were about 65 years old in 2004, the last year 
covered in our study. For these economists, we thus have complete life cycles. 
For the younger ones, the available life cycle becomes, of course, increasingly 
shorter. The shortest life cycles that we decided to consider have a length of 
six years which corresponds to a career age at which promising academic 
economists are granted tenure. We thus only consider scholars who received 
their doctoral degrees between 1969 and 1998 and who were employed at a 
German university in the year 2004 or have retired from such a position 
shortly before. 

On the basis of these restrictions we have analyzed the publication records 
of more than 600 economists. To be more precise, our data set comprises
all EconLit-listed journal publications (up to the year 2004) authored or co- 
authored by the economists included in our sample. Evaluating only the set of 
journals referenced in EconLit excludes journals whose scope is not aligned 
with the current mainstream of economic research, new economics journals, 
and journals that do not meet EconLit’s quality standards. Whereas scope and 
timeliness are issues to be considered (scholars with peripheral or interdisciplinary
specializations and scholars working on emerging fields may be
underrated), exclusion because of insufficient quality does not appear to be an 
issue since the minimum quality standard set by EconLit is rather soft. 

The quality standards set by the journals indexed in EconLit are of course 
quite diverse. Any study working with this data base therefore needs to 
capture quality differences in one way or another. If a reward scheme does not 
take these quality differences into account, the scientists would no longer 
attempt to produce research output of the highest possible quality but would 
rather shift their efforts towards producing results that are just about pub- 
lishable in the journals with the softest quality standards. In other words: 
“Gresham’s law of research evaluation” would see to it that mediocre 
research drives good research out of circulation. 

A popular approach to controlling for journal quality is to use a subset of 
journals whose prime quality is uncontested. The ranking study by Kalaitzi- 
dakis et al. (2003), for example, followed this strategy. Restricting the journal 
set in this manner comes, however, at a significant cost. First of all, infor- 
mation especially about less accomplished scientists who do not publish in 
prime journals is lost with the consequence that reward schemes based on 
such a set of journals would not provide any incentives for this class of 
employees. A second drawback of restricting the journal set is that this 
strategy would prohibit us from investigating changes in research quality over 
the life cycle. For these reasons we decided to work with the whole set of 
journals indexed in EconLit, and to explicitly control for journal quality. 

The evaluation of journal quality represents a field of its own. From the 
plethora of weighting schemes we chose the “CLpn” scheme proposed by 
Combes and Linnemer (2003) because it is based on the journals’ relative 
(subjectively perceived) reputation and (objectively measured) impact, and 
thus appears to provide a well-balanced rating over the whole quality range.³
The CLpn-scheme converts each journal publication into standardized units of
AER-page equivalents. The quality weight of the five top-tiered journals is
normalized to unity. The sixteen second-tiered journals’ imputed weight
amounts to two thirds. Weights then decline in discrete steps (one half, one
third, one sixth) down to the minimum weight of one twelfth. Our variable
that measures research productivity of researcher i on an annual basis (year T)
is defined as follows:


(1) CLpn_i(T) = \sum_{k} \frac{w_k \, p_{ki}}{n_{ki}},


3 One disadvantage of this method is that journal quality is kept constant over the 
period of investigation that covers, after all, a time-span of 36 years. 
where p_{ki} and n_{ki} denote the number of pages and the number of authors of
researcher i’s publications k, while w_k denotes the appropriate journal
quality weight. The CLpn-index thus not only controls for quality but also for
the number of authors and the length of the journal articles.⁴
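The construction of the annual index can be sketched in a few lines. This is an illustrative sketch only: the record layout (`year`, `pages`, `authors`, `weight`) and the function name `clpn_by_year` are assumptions made for the example, while the six tier weights are the ones listed in the text. Exact fractions are used so that the weights one third, one sixth, and one twelfth do not introduce rounding noise.

```python
from collections import defaultdict
from fractions import Fraction

# Journal quality weights by tier, from the top tier (1) down to the minimum (1/12).
TIER_WEIGHTS = [
    Fraction(1), Fraction(2, 3), Fraction(1, 2),
    Fraction(1, 3), Fraction(1, 6), Fraction(1, 12),
]


def clpn_by_year(publications):
    """Sum weight * pages / authors over a researcher's publications, per year."""
    output = defaultdict(Fraction)  # missing years start at Fraction(0)
    for pub in publications:
        output[pub["year"]] += pub["weight"] * pub["pages"] / pub["authors"]
    return dict(output)


# Hypothetical record: six single-authored 20-page articles in lowest-tier journals.
example = [
    {"year": 1990 + 2 * j, "pages": 20, "authors": 1, "weight": TIER_WEIGHTS[-1]}
    for j in range(6)
]
total_aer_pages = sum(clpn_by_year(example).values())

# Hypothetical record: one 30-page top-tier article written by two authors.
coauthored = clpn_by_year(
    [{"year": 2000, "pages": 30, "authors": 2, "weight": TIER_WEIGHTS[0]}]
)
```

Under these assumptions, six 20-page articles in the lowest tier add up to 10 AER-equivalent pages, and each of the two authors of the 30-page top-tier article is credited with 15 pages.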

In order to obtain comparable individual life cycles of research productivity, 
we merged the annual records of individual research productivity with the 
year in which the respective researcher obtained the doctoral degree, i.e., we 
align the individual life cycles by this reference year. Our data set also con- 
tains some coarse information about the included economists’ field of spe- 
cialization, and we also documented the researchers’ gender. Only about 7.5% 
of our academic economists are women. 15% of the economists in our sample 
specialize in microeconomics, 26% in macroeconomics, 34% in public eco- 
nomics and 16% in econometrics. Economists who could not be assigned to 
any one of these fields were assigned to the field OTHER. 


3 Describing the landscape of German academic research in economics 


In order to obtain a first impression of the size and distribution of the oeuvres 
of German academic economists, we cumulate the annual research outputs 
defined in equation (1) from career year -5 until career year t, where 0
denotes the year in which the economists were granted their doctoral degrees: 


(2) R_i(t) = \sum_{T=-5}^{t} CLpn_i(T),


and then compute for all career ages t the borderline values of R for the
following percentiles: 25%, 50%, 80%, and 90%. The resulting information is
depicted in figure 1.
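The cumulation and the percentile borderlines can be sketched as follows. The function names are hypothetical, and the nearest-rank percentile rule is an assumption of this sketch, since the text does not say which percentile convention was used.

```python
import math


def cumulative_output(annual, t, start=-5):
    """Cumulate annual output from career year `start` up to career year t."""
    return sum(annual.get(year, 0.0) for year in range(start, t + 1))


def percentile_borderline(values, q):
    """Nearest-rank percentile: smallest value with at least a share q at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


# Hypothetical annual outputs of one economist, keyed by career year.
annual = {-2: 1.0, 0: 2.0, 3: 4.0}
stock_at_0 = cumulative_output(annual, 0)
stock_at_5 = cumulative_output(annual, 5)

# Hypothetical stocks of five economists at the same career age.
stocks = [0.0, 0.0, 1.0, 2.0, 10.0]
borderlines = {q: percentile_borderline(stocks, q) for q in (0.25, 0.5, 0.9)}
```

Repeating the borderline computation for every career age t yields one line per percentile, as plotted in figure 1.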

Averaging over all economists in our sample we observe, first of all, that the 
oeuvre of the median researcher is quite modest. During his whole career the 
median German economist does not manage to produce more than 10 AER- 
equivalent pages. Assuming that all of this research has been published in 
journals belonging to the lowest quality tier, this implies that the median 
economist publishes about 6 journal articles (20 pages each) during his 
research career, i.e., one article every six years. Second, figure 1 reveals that 
the distribution of the individual research oeuvres is skewed to the right and 
exhibits a large variation. These characteristics do, of course, not come as a 
surprise. Rather, they constitute stylized facts that have transpired from many 


4 We did not, however, take into account that the number of words per page differs 
across journals. 
Figure 1
related studies.⁵ More interesting is the fact that the percentile borderlines are
not monotonic and exhibit a marked “overall” concavity. The violation of
monotonicity of the stock variable R is not as puzzling as it might appear at 
first sight; it simply reflects cohort effects in our unbalanced panel. If research 
productivity increases dramatically across cohorts, the stock of the scientists at 
a young career age (measured across all cohorts) may well be larger than the 
stock of the scientists at an older career age (measured across only those 
cohorts who have reached this career age). The concavity of the percentile 
borderlines admits two interpretations: it may either reflect decreasing mar- 
ginal productivity over the life cycle or it may again represent an artifact of 
cohort effects in our unbalanced panel. 
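A toy example illustrates how an unbalanced panel can produce a pooled stock variable that falls with career age even though no individual's stock ever falls. All numbers below are hypothetical and chosen only to make the mechanism visible.

```python
from statistics import median

# Hypothetical cumulative outputs R (AER-equivalent pages), keyed by career age.
old_cohort = {6: [1.0, 2.0, 3.0], 30: [4.0, 5.0, 6.0]}  # observed at both ages
young_cohort = {6: [10.0, 12.0, 14.0]}                  # career age 30 not yet reached


def pooled_median(career_age, cohorts):
    """Median stock at a career age, pooled over all cohorts observed at that age."""
    values = []
    for cohort in cohorts:
        values.extend(cohort.get(career_age, []))
    return median(values)


median_at_6 = pooled_median(6, [old_cohort, young_cohort])
median_at_30 = pooled_median(30, [old_cohort, young_cohort])
```

Here every member of the old cohort has a larger stock at career age 30 than at career age 6, yet the pooled median drops from 6.5 to 5.0 because the productive young cohort only enters the sample at the early career age.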

In order to discriminate between the decreasing marginal productivity 
interpretation and the interpretation that presumes cohort effects, we analyzed 
the career-time oeuvres of different cohorts. For that purpose, we divided our 
sample of economists into five cohorts, each comprising six age groups. The 
oldest cohort comprises the age groups 1969-1974, and the youngest one the 
age groups 1993-1998. The members of the oldest cohort thus look back on a 
career of at least 30 years, while the members of the youngest one have had a 
career of at least six years. The percentile borderlines are now monotonic,
indicating that vintage effects within the cohorts are relatively small. Figure 2a 
(see p. 108) presents the percentile borderlines for the oldest cohort.⁶


5 The highly skewed nature of publication was first observed by Lotka in 1926 in a 
study on physics journals (cf. Stephan, 1996, 1203). 

6 The working paper version of this article also presents the evidence for the other
cohorts. 
Figure 2a

Figure 2b

Two interesting insights transpire. First, eyeballing of the cohort-specific
percentile borderlines does not suggest any pronounced concavity. An S- 
shaped life cycle productivity pattern supporting the factors portrayed by the 
standard human capital model thus cannot be identified, at least not at the 
aggregate level. To shed some more light on this issue, we will, therefore, 
further investigate our economists’ life cycles with the help of micro-econo- 
metric methods in section 4. The second feature that emerges is more con- 
clusive. The German economics profession is characterized by striking cohort 
effects in research productivity: the percentile borderlines become increasingly
steep for younger cohorts. The increase in cohort-specific research
productivity is illustrated in figure 2b in which the 80%-lines of the five 
cohorts are superimposed. This representation shows that it took an econo- 
mist who tops 80% of his peers in the oldest cohort about 18 years to accu- 
mulate an oeuvre of 20 AER-equivalent pages, whereas a top-80% economist 
of the second cohort managed to do so in 12 years. This time span is reduced 
to 8 and 4.5 years for the two following cohorts, respectively, and the top-80% 
economist of the youngest cohort needs only 3.5 years to produce 20 AER-equivalent
pages.

From our data set we can extract information that is directly relevant for the 
evaluation of individual researchers. In particular, we can assign each econ- 
omist a peer-specific performance rank at each point of career time. This kind 
of information is of prime importance for a university management that wants 
to pursue a rational performance-related remuneration policy. Information 
about the standing of individual researchers vis-à-vis their peers is, moreover,
a prerequisite for department rankings that are insensitive to the age structure 
of the evaluated faculties. We will turn to this issue in section 6. Whole career 
profiles in terms of relative performance are, finally, of vital importance to 
assess the persistence of research performance. The crucial question in this 
context is whether it is possible to forecast a scientist’s research performance 
from his track record, and if so, at which stage of a scientist’s career such 
forecasts are sufficiently accurate to serve as a basis for management decisions 
such as granting tenure or awarding substantial research grants. The persis- 
tence issue will be dealt with in section 5. Here we will follow up the first issue 
and ask ourselves how the information about the current cohort-specific 
ranking of individual economists can be condensed in such a way that it can 
serve as a simple management information device. 

To do so, we consider the standard situation faced by a university man- 
agement or a research foundation that would like to assess an economist’s 
relative research standing in the German academic profession. Usually, the 
evaluator has access only to this person’s CV, including the publication list. With
the help of the publication list it is easy enough to compute via equations (1) 
and (2) the accumulated research output R at the end of the year 2004. 
Dividing this output R by the adjusted career age t (t=2010-Y, where Y 
denotes the year in which the evaluated economist received his or her doctoral 
degree) yields the average research productivity P.⁷ How does the average
research productivity P of an economist translate into a ranking vis-à-vis his or
her peers? Since the relative research standing depends on the average 


7 We let the productive time of a researcher start five years before the doctorate. Since 
the doctorate takes place in career year t=0, the adjusted career age t= 2004-Y + 6 = 
2010-Y. 
research productivity as well as on the cohort age of the person to be evaluated,
we are seeking a formula of the form S = f(P, Y), where S denotes the
evaluated economist’s relative research standing in percentiles. Regressing S
on Y and P yields the following formula:


(3) S = 18.3 - 0.0092 \cdot Y + 0.55 \cdot \sqrt{P}.

For evaluation purposes, the negative residuals of our regression (overestimation)
clearly represent the relevant downside risk. Since the distribution
of residuals resembles a normal distribution with a standard deviation of
0.077, the probability of overestimating a candidate by at least 10 percentiles
(0.10/0.077, or about 1.3 standard deviations) is about 10%. This appears to
be a risk well worth taking in a situation in which the
alternative is to rely on peer evaluations and recommendations that are
notoriously biased.
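Two of the quantities used in this section are simple enough to check by hand or in a few lines of code: the adjusted career age (2004 - Y + 6 = 2010 - Y) and the overestimation risk implied by a normal residual distribution. The sketch below uses hypothetical function names and relies only on the figures stated in the text (last year 2004, five pre-doctoral years, residual standard deviation 0.077).

```python
import math


def adjusted_career_age(doctorate_year, last_year=2004, lead_years=5):
    """Years from `lead_years` before the doctorate up to and including `last_year`."""
    return last_year - doctorate_year + lead_years + 1


def average_productivity(cumulative_output, doctorate_year):
    """Average annual output P = R / (2010 - Y)."""
    return cumulative_output / adjusted_career_age(doctorate_year)


def normal_tail(threshold, sigma):
    """P(X > threshold) for a normal variable with mean 0 and std. dev. sigma."""
    return 0.5 * math.erfc(threshold / (sigma * math.sqrt(2.0)))


age_oldest = adjusted_career_age(1969)
age_youngest = adjusted_career_age(1998)
p_example = average_productivity(20.5, 1969)  # hypothetical oeuvre of 20.5 pages
risk = normal_tail(0.10, 0.077)               # overestimation by 10 percentiles or more
```

With these inputs the adjusted career age ranges from 12 years (class of 1998) to 41 years (class of 1969), and the tail probability comes out just under 10%, in line with the figure quoted above.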


4 A micro-econometric investigation of life cycle productivities 


The empirical evidence presented in the previous section suggests that life 
cycles in economic research productivity are rather flat. This evidence refers, 
however, to highly aggregated data. In order to do justice to the heterogeneity 
in our population of economists we exploited the micro-structure of our data 
set by regressing individual research productivity not only on career-time and 
cohort membership, but also on the field of specialization, on a gender dummy 
variable, and on a measure of ability. Following Goodwin and Sauer (1995), 
we ranked the researchers according to their cohort-specific average life-time 
productivity. We then defined quintile ranks within the distribution for each 
three-year cohort and assigned each researcher the appropriate ability rank. 

Since about three quarters of our observations of the dependent variable
(research productivity of economist i in year t) are zeroes, OLS is not
appropriate. To accommodate this high degree of censoring we used a hurdle
model, i.e., we allow the decision making process to be more complex than the
one captured by a standard Tobit model.¹⁰ The first part (being active) is
portrayed with a Probit model, whereas the distribution of the positive counts
is modeled with the help of a truncated Negative Binomial model since the
observed density distribution of our dependent variable resembles the pattern
of count data.
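The two-part structure of a hurdle model can be written down compactly. The sketch below is not the estimation code of the study: it only evaluates the probability function of a Probit hurdle combined with a zero-truncated Negative Binomial (in the common mean/dispersion parameterization), and the parameter values are hypothetical. In an actual regression, the Probit index and the Negative Binomial mean would each depend on the explanatory variables.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def negbin_pmf(k, mean, alpha):
    """Negative Binomial probability of count k (mean `mean`, dispersion alpha > 0)."""
    r = 1.0 / alpha
    log_p = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mean))
        + k * math.log(mean / (r + mean))
    )
    return math.exp(log_p)


def hurdle_pmf(k, index, mean, alpha):
    """Probit hurdle for being active, zero-truncated NegBin for positive counts."""
    p_active = normal_cdf(index)
    if k == 0:
        return 1.0 - p_active
    truncated = negbin_pmf(k, mean, alpha) / (1.0 - negbin_pmf(0, mean, alpha))
    return p_active * truncated


# Hypothetical parameters: an index of -0.6745 gives roughly a 25% chance of being active.
p_zero = hurdle_pmf(0, -0.6745, 2.0, 0.5)
total = sum(hurdle_pmf(k, -0.6745, 2.0, 0.5) for k in range(200))
```

The zero outcome is governed entirely by the Probit part, so the share of zeroes (about three quarters in this example) can be fitted independently of the distribution of positive outcomes, which is what distinguishes the hurdle model from a Tobit specification.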


8 The relevant peer group always consists of five age groups, namely the age group of 
the person to be evaluated and the four neighboring age groups. 

9 Our formula approximates our regression result which explains 93% of the variance
of S. 

10 For details and other estimation techniques, see the companion paper: Rauber and 
Ursprung (2007). 
The results of our regressions are shown in the working paper version of this
article. Our hurdle model focuses on heterogeneity with respect to ability, i.e., 
we include dummy variables for each ability rank and also allow the life-cycle 
polynomials to differ across the ability ranks 5 (top researchers), 4 (accomplished
researchers) and 1-3 (journeymen researchers).¹¹ Figures 3a and 3b
(see p. 112) visualize the fact that the time polynomials differ across ability 
ranks and that there are significant differences between the time polynomials 
of the Probit and NegBin part, thereby suggesting different forces governing 
the two respective processes. Our results indicate that the top-researchers 
manage to increase their publication incidence over time while their research 
productivity somewhat declines in the second half of their careers. It thus 
appears that the best researchers in the profession focus in the beginning of 
their careers on fewer research projects (articles) but execute them with more 
effort which gives rise to higher quality (better journals) and more extensive 
results (longer articles), and all this is achieved with fewer co-authors. Later 
on in their careers these researchers get involved in more projects that are, 
however, executed with less effort. The two processes (number of projects and 
research effort put into each project) neutralize each other and, in con- 
junction, give rise to the flat life cycles in overall research productivity already 
observed. Decomposing our measure of research productivity and regressing
average quality, article length, and number of co-authors on our explanatory
variables indeed shows that older economists work together with more collaborators
(co-authors), write shorter articles, and publish in lower quality
journals. Interestingly, however, top researchers manage to maintain quality
much more than their less gifted peers.¹²

As compared to the top-researchers, the “accomplished” researchers’ 
publication incidence and research productivity decline more sharply over
their life cycles. These life cycles are thus better in line with the predictions of 
the human capital approach to explaining labor productivity. The “journey- 
men” researchers, finally, have rather flat and nondescript life cycles. 

The coefficients of the cohort dummies, not surprisingly, increase over time. 
This result is consistent with the joint hypothesis of more productive younger 
cohorts and a constant historical time effect. We admit, however, that it is not 
inconceivable that our regressions somewhat overestimate the identified 
vintage effects since the gradual substitution process towards publishing 
research results mainly in journals may still have been at work in the begin- 
ning of our period of observation. The estimated coefficients of the gender 
dummy variable indicate that female economists publish significantly less than 


11 It was necessary to bundle the first three ranks together because of the high degree
of censoring within these ranks. Nevertheless, we still allow for different intercepts for 
each rank. 

12 See our companion paper: Rauber and Ursprung (2007). 
Figure 3a: Probability of being active

Figure 3b: Conditional productivity by rank

their male peers. This negative effect, however, arises from the fact that
female academic economists seem to be more likely not to engage in research 
at all. If female economists decide to be active researchers, then they are just 
as productive as their male peers. Our field dummies, finally, show that 
researchers specializing in macroeconomics are less likely to be active 
researchers, and active micro-economists publish more than their peers. Even 
though these effects appear to be relatively small and fragile, it might be
worthwhile to bear these field effects in mind when evaluating individual 
economists. 

In a second (standard Tobit) regression we focus on heterogeneity with 
respect to cohort membership. As in the hurdle model, we allowed the life 
cycle polynomials to differ, this time across our six cohorts. Figure 4 visualizes 


Figure 4: Tobit estimates by cohort (macro rank 4)

the cohort-specific time polynomials. It can be seen with the naked eye that
the shape of these life cycles differs across cohorts: younger cohorts have more
hump-shaped life cycles than older cohorts. With respect to the other
explanatory variables nothing changes dramatically.

We thus arrive at the result that the life cycles of younger cohorts - as far as 
we can tell from the initial phases of these cycles - correspond more closely to 
the predictions of the standard human capital approach to explaining changes 
in labor productivity than the evidence we have for older economists. Various 
hypotheses lend themselves to explaining this result. The first and arguably 
most plausible one maintains that the academic environment has become 
increasingly competitive over the last 35 years. In a more competitive
work environment, employees who want to succeed are forced to optimize 
under the pertaining constraints. It is thus not surprising that their behavior 
more closely corresponds to the predictions of the human capital model that 
narrowly focuses on labor market incentives. An alternative hypothesis is that 
doctoral students of older cohorts have been exposed to different role models
than the younger cohorts. This hypothesis relates to the preference formation 
process which works through sociological imprinting. The last hypothesis does 
not assume a change in preference formation but different preferences of the 
people who decide to pursue an academic career. Whether it is possible to 
empirically discriminate between the three hypotheses (which are, of course, 
not mutually exclusive), remains to be seen.¹³


5 Persistence of research productivity 


The economics of science literature has clearly demonstrated that an academic 
scientist’s research productivity has a noticeable influence on his or her labor 
market success. First of all, research productivity varies positively with pay (cf. 
Kenny and Studley 1996 and Moore et al. 2001 for empirical evidence relating 
to the economics profession). A strong research record has, moreover, also a 
positive influence on the obtainable job status in terms of the employing 
university’s reputation (cf. Grimes and Register 1997 and Coupé et al. 2003),
and scientists with strong research records are more likely to be granted 
tenure and to be promoted to higher academic ranks (cf. Coupé et al. 2003). 
Tenure and promotion to the highest level of the academic hierarchy may, on 
the other hand, have detrimental effects on research productivity because 
these types of upgrading are irrevocable and thus reduce incentives to work 
hard. Backes-Gellner and Schlinghoff (2004), for example, have shown that
research productivity of German (business) economists increases before the 
only crucial career step (appointment to a professorship) and is reduced 
afterwards. An early study on the impact of tenure that arrived at similar 
results for the United States is Bell and Seater (1978). 

Precisely because irrevocable career steps are liable to have a certain 
influence on research productivity, it is important to know at what stage of the 
academic career the research potential of a scientist can be assessed with 
reasonable accuracy and to what extent this potential is liable to be used in the 
post-tenure period. In other words, it is (from a managerial point of view) 
important to possess firm information on the persistence of individual research 
productivity. Inspection of our aggregate and individual data has already 
revealed that research productivity in our sample of economists is charac- 
terized by a great deal of persistence. In this section, we focus on the question 
whether the traditional American policy to grant, postpone, or decline tenure 


13 See Frank and Schulze (2000) for an experimental design to test a related set of
hypotheses. 

14 For a recent theoretical study of tenure and related incentive schemes in academia, 
see Dnes and Garoupa (2005). 
after a review period of six years does make sense in the light of our empirical
evidence. Many knowledgeable observers agree that young scientists have to 
wait too long to be promoted to a professorship in the German university 
system. On the average, the implicit probation period amounts to eight years 
(German economists obtain their doctoral degrees when they are about 30 
years old and are, on the average, appointed to their first professorship at the 
age of 38). The objective of the investigation presented in this section is to 
inquire whether the review period could indeed be shortened without great 
loss in terms of evaluation accuracy. 

As compared to tenure-induced effects on research productivity, the optimal 
timing of the tenure decision has not received a great deal of attention in the
scientometric literature dealing with the economics profession. A notable 
exception is the study by Hutchinson and Zivney (1995). These authors 
regress the average annual post-tenure productivity (measured in numbers of 
journal articles) on the pre-tenure oeuvre of economists using two hypo- 
thetical review periods, namely the standard six years and four years. Their 
regression analysis leads them to concur with Bell and Seater’s (1978) con- 
clusion based on cross-sectional data “that granting of tenure seems to have 
negative effects on individual publishing performance” (1978, 614). “Yet, 
because the negative effect is so small numerically, 0.01 articles per year, our 
results indicate that publishers maintain essentially constant pre- and post- 
sixth-year rates of publication over their post-doctorate years. Moreover, 
shortening the review period from six years after the doctorate to four, relying 
upon our 1969-1979 doctorates, only slightly reduces the ability to predict 
future journal publication rates based on existing journal publication infor- 
mation while also producing almost constant pre- and post-fourth-year rates 
of publication” (Hutchinson and Zivney 1995, 74). 

In order to check whether the German economists’ academic standing
reached by their sixth year after the doctorate is a good indicator for their
mid-career reputation (at the approximate age of 42, i.e., in the twelfth year
after the doctorate), we ranked all economists in our sample at career time t = 6
according to the size of their oeuvres in relation to a special five-year cohort
for each class.¹⁵ We then defined quintile ranks and assigned each researcher
the appropriate rank. Repeating this procedure for the career year t = 12, we
arrived at the mid-career ranking of the same economists and were then able
to compute the probability of moving from one quintile rank to another within
the observation period. These transition probabilities are shown separately in
table 1 for the older economists in our sample (classes of 1969 to 1980) and for
the younger ones (classes of 1981 to 1992). Due to the inescapable problem of


15 Members of the class of 1981, for example, are ranked in the cohort comprising the 
classes of 1979 up to 1983. 
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Table] Transition probabilities: year 6- year 12 


1&2 3 4 5 
1&2 Coh. 1 0.80 0.14 0.04 0.02 
Coh. 2 0.83 0.13 0.04 0.00 
3 Coh. 1 0.30 0.41 0.24 0.05 
Coh. 2 0.28 0.44 0.23 0.05 
4 Coh. 1 0.02 0.37 0.47 0.14 
Coh. 2 0.00 0.37 0.56 0.07 
5 Coh. 1 0.00 0.00 0.19 0.81 
Coh. 2 0.00 0.00 0.14 0.86 


Cohort A: 1969-1980 
Cohort B: 1981-1992 


research-inactive scholars we had to group the first two quintiles together with 
the consequence that the probabilities in the columns do not add up to 100%. 

The results summarized in table 1 once more show that research production 
is indeed characterized by a great deal of persistence. The probabilities on the 
main diagonal are substantially larger than the off-diagonal probabilities, 
implying that marked changes in the academic standing are low probability 
events. Table 1, in particular, shows that appointing a young professor with a 
high reputation is a relatively safe bet these days. On the other hand, 
appointing a professor with a bad publication record and hoping (perhaps 
based on hearsay) for the best, is not much more than wishful thinking. The 
probability of a bottom group researcher making it in the first six years of his 
or her full professorship to the top 40% is nowadays not more than 4 out of 
100.16 Table 1 also documents that the research track record has become a 
better indicator of future research productivity over the years. The transition 
probabilities of the younger economists are more centered on the main 
diagonal than those of the older economists. 
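
The two claims in this paragraph can be checked directly against the figures of table 1. The following minimal sketch copies the table's entries and assumes that rows give the rank group in year 6, that columns give the rank group in year 12, and that the pooled group 1&2 is the bottom group; the function names are illustrative only.

```python
# Transition probabilities copied from table 1.
# Assumed reading: rows = rank group in year 6, columns = rank group in year 12,
# ordered "1&2" (pooled bottom group), "3", "4", "5" (top quintile).
TABLE_1 = {
    "A": [  # classes of 1969 to 1980
        [0.80, 0.14, 0.04, 0.02],
        [0.30, 0.41, 0.24, 0.05],
        [0.02, 0.37, 0.47, 0.14],
        [0.00, 0.00, 0.19, 0.81],
    ],
    "B": [  # classes of 1981 to 1992
        [0.83, 0.13, 0.04, 0.00],
        [0.28, 0.44, 0.23, 0.05],
        [0.00, 0.37, 0.56, 0.07],
        [0.00, 0.00, 0.14, 0.86],
    ],
}


def mean_diagonal(matrix):
    """Average probability of staying in the same rank group."""
    return sum(matrix[i][i] for i in range(len(matrix))) / len(matrix)


def bottom_to_top_40(matrix):
    """Probability of moving from the bottom group into quintile 4 or 5."""
    return matrix[0][2] + matrix[0][3]


for cohort, matrix in TABLE_1.items():
    print(cohort, round(mean_diagonal(matrix), 4), round(bottom_to_top_40(matrix), 2))
```

Under this reading, the younger cohort B has the larger average diagonal entry, and its bottom-to-top-40% probability is 0.04, the "4 out of 100" quoted in the text.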

The evidence summarized in table 1 documents that, currently, a six year 
review period provides ample evidence for an informed tenure decision. The 
question therefore arises as to whether the German method of appointing 
professors (i.e., after an average review period of eight years) is indeed sig- 
nificantly superior in terms of avoiding bad appointments to justify the cost 
(especially the attendant loss of appeal to pursue an academic career). To 
investigate this question, we have computed the transition probabilities of the 


16 Notice that the persistence documented in table 1 is, of course, to some extent 
predicated by the question we ask, i.e. by the fact that we use stock data that reflect 
reputation. Using flow data would certainly increase the inter-quintile transition prob- 
abilities. 
Table 2 Transition probabilities of cohort B
(rows: quintile rank at the end of the review period; columns: quintile rank in year 12)

Rank   Years     1&2     3      4      5
1&2    4-12     0.80   0.13   0.05   0.02
       6-12     0.83   0.12   0.05   0.00
       8-12     0.88   0.11   0.01   0.00
3      4-12     0.30   0.39   0.26   0.05
       6-12     0.28   0.44   0.23   0.05
       8-12     0.24   0.62   0.14   0.00
4      4-12     0.07   0.35   0.45   0.13
       6-12     0.00   0.37   0.56   0.07
       8-12     0.00   0.19   0.70   0.11
5      4-12     0.00   0.02   0.19   0.79
       6-12     0.00   0.00   0.14   0.86
       8-12     0.00   0.00   0.12   0.88


younger German economists also for hypothetical review periods of eight and 
four years. The results are summarized in table 2. Given that we work with 
stock variables, it is not surprising that the predictions become somewhat 
sharper when using an eight instead of a six year review period, and somewhat 
more diffuse when using a four year period. More interesting is the fact that 
reducing the review period from the German standard of eight years to the 
American standard of six years does not appear to come at an inordinate loss 
of information. Research excellence, in particular, can be detected after six 
years just as well as after eight years. In many cases of truly superior young 
scientists, a review period of four years may well be sufficiently long to make a 
reasonably safe appointment decision. Our conclusion is thus in line with the 
results derived for the United States by Hutchinson and Zivney. 
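
The comparison of review periods can be made concrete with the entries of table 2. The sketch below is only an illustration: it assumes that group 5 is the top quintile, and it uses the average diagonal entry as a crude summary of predictive sharpness, a measure chosen here and not taken from the article.

```python
# Transition matrices copied from table 2 (cohort B), keyed by review period in years.
# Assumed reading: rows = rank group at the end of the review period,
# columns = rank group in year 12, ordered "1&2", "3", "4", "5" (top quintile).
TABLE_2 = {
    4: [
        [0.80, 0.13, 0.05, 0.02],
        [0.30, 0.39, 0.26, 0.05],
        [0.07, 0.35, 0.45, 0.13],
        [0.00, 0.02, 0.19, 0.79],
    ],
    6: [
        [0.83, 0.12, 0.05, 0.00],
        [0.28, 0.44, 0.23, 0.05],
        [0.00, 0.37, 0.56, 0.07],
        [0.00, 0.00, 0.14, 0.86],
    ],
    8: [
        [0.88, 0.11, 0.01, 0.00],
        [0.24, 0.62, 0.14, 0.00],
        [0.00, 0.19, 0.70, 0.11],
        [0.00, 0.00, 0.12, 0.88],
    ],
}


def mean_diagonal(matrix):
    """Average probability of keeping the same rank group until year 12."""
    return sum(matrix[i][i] for i in range(len(matrix))) / len(matrix)


def top_persistence(matrix):
    """Probability that a top-quintile researcher is still in the top quintile in year 12."""
    return matrix[-1][-1]


for years, matrix in sorted(TABLE_2.items()):
    print(years, round(mean_diagonal(matrix), 4), top_persistence(matrix))
```

The average diagonal entry rises with the length of the review period, while top-quintile persistence differs by only two percentage points between the six-year and the eight-year period.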


6 Some new rankings for German economics departments 


If one agrees that the evaluation of individual researchers should take career 
age and cohort affiliation into account, then these age dimensions should also 
be considered when ranking whole departments. After all, meaningful 
department rankings are supposed to reflect the research competence of the
departments’ members and not the age structure of their faculty. In this section
we therefore present some rankings of German economics departments that 
reflect the life cycle dimension of the evaluated faculties. The objective is to 
demonstrate how, in principle, such rankings can be conceptualized and to 
show how rankings that incorporate life cycle information compare to tradi- 
tional rankings that do not do so. 
We decided to produce rankings that are comparable to the research
rankings published by the Centrum für Hochschulentwicklung (CHE) because 
the CHE-rankings, even though criticized by an impressive number of 
knowledgeable observers of the German research landscape, nevertheless are 
quite influential. The reference groups of the CHE-rankings are the tenured 
professors of the respective departments. Whether this reference group
constitutes a meaningful basis for an evaluation is questionable. Nevertheless, we
adopt this approach here in order to provide results that are easily comparable
to an established German standard. 

The rankings that are presented in table 3 refer to 52 economics depart- 
ments. All of these departments confer degrees in economics and belong to a 
German university; we thus do not consider economics departments of sec- 
ond-tier universities, the so-called universities of applied sciences. One of the 
main (but little appreciated) challenges of current potential rankings as 
compared to work-done-at rankings consists in the identification of the 
respective faculty members. Since some of the faculty lists used by the CHE 
are grossly at variance with a truthful representation, we decided to base our 
rankings on a revised set of faculty lists that is reproduced in the appendix of 
the working paper version of this article. 

Our first ranking (see column A in table 3) simply represents the mean of 
the individual research standings of the respective faculty members, where the 
individual research standing is defined via the percentile value of average
lifetime research productivity within a three-year cohort comprising all
economists who received their doctoral degrees in the same year as the evaluated
individual or in a neighboring year. Since these overlapping three-year cohorts
are rather small for some years, we also show a ranking using cohorts of five
years (column B). The rankings appear to be quite insensitive to the chosen
cohort size: only three out of the 52 ranked departments move by three ranks
and one (Lüneburg, one of the two smallest departments with three
professors) by four ranks across the two rankings. The first two rankings are thus
very similar, which is confirmed by a rank-correlation coefficient of
99.6%.
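
The reported rank correlation can be recomputed from columns A and B of table 3. The sketch below assumes that the coefficient meant is Spearman's rank correlation; the article does not name it. The formula used is the standard one for rankings without ties, and the two lists copy the ranks in the row order of table 3 (FU Berlin first, UniBW Hamburg last).

```python
# Ranks from columns A and B of table 3, in the table's row order.
RANK_A = [6, 5, 49, 9, 11, 25, 43, 18, 30, 37, 7, 1, 46, 10, 44, 41, 24, 13,
          8, 33, 35, 34, 40, 32, 20, 23, 26, 52, 36, 3, 27, 2, 48, 12, 19, 16,
          4, 39, 42, 21, 14, 50, 31, 28, 17, 45, 29, 47, 51, 15, 22, 38]
RANK_B = [6, 5, 48, 9, 12, 24, 44, 16, 28, 37, 7, 1, 47, 11, 43, 41, 25, 13,
          10, 33, 36, 34, 39, 32, 19, 26, 23, 51, 35, 2, 27, 3, 49, 8, 20, 18,
          4, 40, 42, 21, 14, 50, 30, 31, 17, 46, 29, 45, 52, 15, 22, 38]


def spearman(ranks_x, ranks_y):
    """Spearman rank correlation for two rankings without ties."""
    n = len(ranks_x)
    squared_diffs = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * squared_diffs / (n * (n * n - 1))


rho = spearman(RANK_A, RANK_B)
print(f"{rho:.3f}")  # prints 0.996
```

The squared rank differences sum to 84 for the 52 departments, which gives a coefficient of about 0.996, in line with the 99.6% stated above.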

As far as the top-ranked departments are concerned, the results of the first 
two rankings confirm, in essence, the results of earlier studies and the 
assessment of informed observers of the German economics profession (see, 
e.g., Ursprung 2003). Somewhat surprising is perhaps the fact that the LMU 
Munich is only placed 9th.17

The first two rankings do not take into account that the research standing of 
individual economists is sensitive to their respective field of specialization. As 


17 More important than the rank is of course the numerical value of the variable on 
which the ranking is based (these values are reported in the working paper version of this 
article). In this respect ratings are more meaningful than rankings. 
Table 3

Life Cycle Standard 

A B C D E F 
FU Berlin 6 6 7 6 9 4 
HU Berlin 5 5 1 3 8 5 
HWP Hamburg 49 48 44 48 49 47 
LMU München 9 9 6 i 6 3 
RWTH Aachen 11 12 19 13 4 9 
TU Berlin 25 24 27 27 23 29 
TU Chemnitz 43 44 46 45 42 43 
TU Dresden 18 16 16 14 14 16 
Uni Augsburg 30 28 37 28 31 37 
Uni Bamberg 37 37 42 31 35 38 
Uni Bielefeld 7 7 10 9 11 13 
Uni Bonn 1 1 2 1 1 1 
Uni Bremen 46 47 40 50 51 50 
Uni Dortmund 10 11 8 11 12 12 
Uni Duisburg-Essen 44 43 45 44 44 42 
Uni Erfurt 41 41 41 38 25 20 
Uni Erlangen-Nürnberg 24 25 31 25 24 22 
Uni Frankfurt/Main 13 13 11 12 10 10 
Uni Frankfurt/Oder 8 10 12 10 13 18 
Uni Freiburg 33 33 29 35 32 30 
Uni Gießen 35 36 43 39 46 49 
Uni Göttingen 34 34 34 29 26 28 
Uni Halle-Wittenberg 40 39 33 41 40 41 
Uni Hamburg 32 32 26 34 33 31 
Uni Hannover 20 19 25 23 19 26 
Uni Heidelberg 23 26 18 16 29 7 
Uni Hohenheim 26 23 22 24 22 27 
Uni Jena 52 51 47 49 47 48 
Uni Karlsruhe 36 35 32 33 37 34 
Uni Kiel 3 2 5 3 2 8 
Uni Köln 27 27 36 26 36 33 
Uni Konstanz 2 3 3 4 3 6 
Uni Leipzig 48 49 48 47 48 46 
Uni Lüneburg 12 8 9 8 5 11 
Uni Magdeburg 19 20 20 21 15 25 
Uni Mainz 16 18 15 19 21 17 
Uni Mannheim 4 4 4 2 7 2 
Uni Marburg 39 40 38 32 30 14 
Uni Münster 42 42 35 42 38 36 
Uni Oldenburg 21 21 21 20 20 15 
Uni Osnabrück 14 14 14 18 28 35 
Uni Paderborn 50 50 51 51 50 51 
Uni Passau 31 30 30 36 34 32 
Uni Potsdam 28 31 24 37 39 39 
Uni Regensburg 17 17 13 17 18 24 
Uni Rostock 45 46 49 46 45 40 
Uni Siegen 29 29 28 30 27 23 
Uni Stuttgart 47 45 52 43 43 45 
Uni Trier 51 52 50 52 52 52 
Uni Tübingen 15 15 17 15 16 19 
Uni Würzburg 22 22 23 22 17 21 
UniBW Hamburg 38 38 39 40 41 44 


A: Life Cycle 3 years 

B: Life Cycle 5 years 

C: Life Cycle 3 years with field correction (mean + 3/ — 3 years) 
D: Formula 

E: Standard Approach: ranking within total Dataset 

F: Standard Approach: simple average of productivity 


we have shown in section 4, the field of specialization has a statistically
significant influence on our measure of research productivity. The ranking
presented in column C of table 3 therefore adjusts for these field-specific
differences in publication behavior by aligning the field-specific means. This
ranking is still closely correlated to the former ones: the rank-correlation
coefficients amount to 96.6% and 96.5%, respectively. Now, however, we observe
quite a few larger deviations in individual rankings. Nevertheless, the
group of leading departments does not change as compared to the baseline 
rankings. 

Thus far our rankings were based on orderings of individual scientists within 
narrow peer groups. One could argue that relying exclusively on actual data of 
relatively small cohorts may, in some cases, bias the evaluation of individual 
scientists and thereby give rise to unfair rankings. If, for example, unusually 
many first-rate scientists happen to be of approximately the same age, sci- 
entists who have the “bad luck” to be their contemporaries appear to be 
mediocre even when their overall research record is quite good, simply 
because they are compared only to their immediate cohort peers who are, 
coincidentally, very good. This kind of bias can be avoided by using our for- 
mula presented in equation (3) — albeit at the cost of losing some information. 
The ranking presented in column D of table 3 is based on the ranking of the 
respective faculty members according to our formula. Since the formula-based 
ranking in some instances does markedly differ from the baseline ranking that 
uses actual cohort data we conclude that the identified bias may have an 
undue effect even in the aggregate. 

The last two rankings presented in table 3 do not take the life cycle
dimension of individual research productivity into account. They are based on 
a method that is similar to the method used by Combes and Linnemer (2003) 
in their “career” rankings, i.e., we compute the average research productivity
of each department member and then either use the department-average of 
the respective percentile rankings (column E) or the average of the individual 
productivities (column F). Comparing these standard rankings with our 
baseline ranking demonstrates that life cycle effects are not only significant 
for the evaluation of individual scientists but also for the ranking of whole 
departments (the rank correlations between ranking E and F and the ranking 
A amount to 94% and 88%). Consider, for example, the department of the 
LMU. According to the standard ranking E, the LMU is ranked 6th while
according to our life-cycle rankings A and B it is ranked only 9th. This drop is
apparently due to the fact that the most productive members of the LMU 
department are relatively young; neglecting the fact that young economists 
are in general more productive than older ones thus gives rise to an over- 
estimation of the department’s research standing. The cases of Frankfurt a.M., 
the two small departments of the RWTH Aachen and Lüneburg, and Erfurt 
are similar. The departments of the FU and HU Berlin, Mannheim, Bielefeld, 
Frankfurt a.O. and Osnabrück represent the counterpart category. These 
departments do significantly better when life cycle effects are taken into 
account. In these departments it is thus the old guard that is more productive — 
at least in relative terms. 

The last ranking (F) is more sensitive to outliers than ranking E because 
there is no upper bound for individual productivity. Extremely productive 
scientists thus give rise to a non-representative department average. Which of 
these two standard rankings is to be preferred depends of course on the 
context of the investigation. In any event, these two standard rankings clearly 
support our main argument: life cycle considerations also matter for research 
rankings of whole university departments. 
Evaluation of Researchers


Comment by 
WERNER GÜTH 


1 Introduction 


My assigned task is to comment in an academic way on the academic 
exploration of German academic researchers. This may be compared to stu- 
dents of psychology who study psychology to learn more about themselves. 
And let us not deny it - the fact that this is about us renders the paper
something we would not want to miss reading.

Of course, we are all aware of some of the research by some of our col- 
leagues. So what is new here is the very systematic collection and aggregation 
of publication success of German academic economists with doctoral degrees 
from 1969 to 1998 who were employed by a German university in 2004. These 
data are not readily available and, although one may argue that the data are 
rather selective and possibly even biased, providing such a data basis has to be 
highly appreciated. 

The analysis of the data is impressively thorough by

— distinguishing different cohorts of researchers with probably very dif- 
ferent research environments during the various stages of their career, 

— different types (top, frequent publisher) of economists in each cohort, 

— following the life cycle regarding publications, and 

— decomposing publication scores into their components (journal quality, 
number of pages, number of coauthors). 


Clearly, as demonstrated by the authors, such results can be used both for
evaluating an individual researcher, e.g., by comparing her or him with the
average researcher in her or his cohort, and for evaluating faculties, e.g., by
determining their cohort- or age-adjusted average quality.


2 Measuring publication record 


As in all empirical work, one might complain about the data which the
authors analyze. Economics is just one of the social sciences, and it can both
gain by importing ideas from neighboring fields and inspire research in
neighboring fields. Researchers who engage in such interdisciplinary exchange
might complain about using just economic literature data bases. Doing so would
probably not call the main conclusions into question, but it would be comforting to
know that by broadening the data base, e.g., by including law journals (to
account for a field like “law and economics”) or journals of social, cognitive
and economic psychology, nothing essential changes.

The authors do not consider citation data, presumably since they are 
manipulable, e.g., by forming citation cartels. If so, this would not be entirely
convincing. Editors should soon detect such attempts and take precautionary
measures. And why are publication scores not at all or at least much
less manipulable? Actually, the share of citations to research reports
(monographs, non-economic journals, etc.) outside the economic literature data
base used by the authors could indicate the selection bias and how
representative the data are for all the scientific work of the included German
academic economists. The description: “... that the oeuvre of the median
researcher is quite modest. During his whole career the median German 
economist does not manage to produce more than 10 AER-equivalent 
pages...” could be read as a warning that the data source is quite selective 
rather than stating that German economists are quite unproductive. Citation 
data, job offers, etc. might reveal how the acknowledgement of individual 
researchers by their peers and the publication scores, as measured by the 
authors, are correlated. So far, this is still questionable. 


3 Cohort-specific life cycles 


Not only industrial but also scientific production experienced quite dramatic 
changes in the impressively large time span covered by the data. We now use
analytical and statistical software, much improved computing hardware, better 
data, text systems, etc. and the academic labor markets are, of course, now 
much more competitive. So comparing a young and a senior researcher’s 
publication score would not be fair. The authors avoid this by distinguishing 
quality types (top, frequent, ... publishers) only within a given cohort and by 
cohort weights for individual researchers when comparing faculties. This is 
entirely convincing. Estimating the life cycle not only for different cohorts but 
also for different quality types of each cohort renders the analysis even more 
convincing. 

The explorative analysis via time polynomials up to the fourth degree yields 
convincing results which could be checked for robustness by piecewise linear 
life cycles or dummies for time intervals of one’s career. Due to the many
researchers with a publication score of zero, the overall estimation (table I of
the working paper version) first estimates the probability of publishing at all
and only then how much one publishes, where the latter can furthermore depend
on the field (micro-, macro-, public economics, econometrics).
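
The two-step logic described here (first whether someone publishes at all, then how much) can be illustrated with a purely descriptive decomposition. This is a sketch on invented numbers, not the authors' estimator: the scores below are hypothetical, and no regression is fitted.

```python
# Hypothetical publication scores of ten researchers in one career year
# (invented for illustration; zeros stand for research-inactive years).
scores = [0, 0, 0, 0, 0, 0, 2.5, 4.0, 1.5, 8.0]


def two_part_summary(values):
    """Split the mean into P(score > 0) and the mean score among publishers."""
    positive = [v for v in values if v > 0]
    p_publish = len(positive) / len(values)
    mean_if_publishing = sum(positive) / len(positive) if positive else 0.0
    return p_publish, mean_if_publishing


p_publish, mean_if_publishing = two_part_summary(scores)
overall_mean = sum(scores) / len(scores)

# The overall mean equals the product of the two parts.
print(p_publish, mean_if_publishing, overall_mean)
```

Separating the two parts keeps the many zeros from distorting the estimate of how much active researchers publish, which is the reason for a two-stage approach of this kind.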

From the perspective of an individual researcher, the idea of a life cycle 
suggests, of course, a strong path dependence, e.g., in the sense that past 
publications reflect human capital or habit formation as, for instance, captured
by a (possibly weighted) previous publication score as an aspiration level for 
future research. The authors analyze this by the transition probabilities of 
“quintile” persistence (tables II and III of the working paper version) for two 
time points (career years 6 and 12) and confirm quite some persistence of 
publication habits. 


4 Faculty ranking 


When publication activity is cohort and career age specific, as convincingly
illustrated by the authors, evaluating faculties by just comparing per capita
publication scores, although common practice, appears quite arbitrary because
it favors faculties whose members predominantly belong to more productive (the
later) cohorts and/or are in their most productive career stage. The authors use
their method to correct this, i.e., by assessing for each faculty member the
publication type by her or his percentile ranking in view of her or his total
lifetime publication score in the respective cohort (varying the length of cohorts
to check for robustness). The fact that the authors rely partly on field-specific
(micro-, macro-, public economics, econometrics) adjustments suggests that 
faculties nowadays specialize by trying to focus on specific fields. 
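
The cohort-based correction can be sketched in a few lines. Everything in the example is hypothetical: the departments "Alpha" and "Beta", the researchers, and their productivity figures are invented, and the percentile is computed here as the share of cohort members with productivity at or below one's own, which is only one of several possible conventions.

```python
from collections import defaultdict

# Hypothetical researchers: (name, department, year of doctorate, average productivity).
RESEARCHERS = [
    ("a1", "Alpha", 1975, 2.0),
    ("a2", "Alpha", 1976, 1.5),
    ("o1", "Other", 1975, 1.0),
    ("o2", "Other", 1976, 0.5),
    ("b1", "Beta", 1990, 3.0),
    ("b2", "Beta", 1991, 2.5),
    ("o3", "Other", 1990, 4.0),
    ("o4", "Other", 1991, 3.5),
    ("o5", "Other", 1990, 5.0),
]


def cohort_percentile(person, everyone, half_width=1):
    """Share of the person's cohort (own year +/- half_width) at or below own productivity."""
    _, _, year, productivity = person
    cohort = [p for p in everyone if abs(p[2] - year) <= half_width]
    return sum(1 for p in cohort if p[3] <= productivity) / len(cohort)


def department_scores(everyone, half_width=1):
    """Return (mean cohort percentile, mean raw productivity) per department."""
    percentiles, raw = defaultdict(list), defaultdict(list)
    for person in everyone:
        percentiles[person[1]].append(cohort_percentile(person, everyone, half_width))
        raw[person[1]].append(person[3])
    return (
        {dept: sum(v) / len(v) for dept, v in percentiles.items()},
        {dept: sum(v) / len(v) for dept, v in raw.items()},
    )


life_cycle, standard = department_scores(RESEARCHERS)
print(life_cycle["Alpha"], life_cycle["Beta"])  # cohort-adjusted scores
print(standard["Alpha"], standard["Beta"])      # raw per capita scores
```

In this invented example the younger department "Beta" leads on raw per capita productivity, while the older department "Alpha" leads once each member is compared only with her or his own cohort, which is the kind of reversal discussed in the text.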

This, naturally, changes the ranking of faculties with some losing in rank 
(e.g., the University of Munich (LMU) whose faculty members were mainly 
young but not necessarily overproductive) whereas others gain (e.g., the fac- 
ulties in Berlin (FUB, HUB) whose “old-timers” are relatively productive). 
Given the authors’ affiliation, members of the losing faculties might suspect a 
self-serving attitude. But even they should concede that it would be a rather 
sophisticated and intuitive way of evaluating in a self-serving way. One also 
would like to ask: if one accounts for the age composition of a faculty, why 
does one not account for heterogeneity in other aspects like number of stu- 
dents, PhDs, habilitations, type of graduate education, etc.? 


5 Suggestions 


The authors motivate their study by the apparent need to assess the promise
of individual scholars as well as of faculties when deciding on academic
promotion and when designing funding schemes for faculties and universities. Do
they really believe that we do it only for the money? Did not old German
academic economists without any monetary incentive to publish prove the
opposite?

It definitely is interesting to explore the best reward and funding schemes 
when money rules the academic world. But one should not forget that academic
life leaves us a lot of freedom in what to research (as the authors illustrate,
research can even be self-reflective) and with whom we cooperate, and
offers a lot of exciting experiences by attending workshops and conferences
and spending sabbaticals abroad. For many of us this seems to be quite 
important. In this sense, the study is very informative and inspiring but one 
should refrain from policy recommendations before having discussed how 
decisive some of its shortcomings are. The authors selectively measure 
(publication) success, neglect habit formation and intrinsic interest in aca- 
demic research, and do not pay attention to other success measures like 
citation impact, external funding, external offers, etc. 

On the other hand, the data basis offers several chances to answer new 
questions (Are American coauthors more helpful than German ones in 
improving one’s life cycle?) or old ones anew (Are scholars with names early 
in the alphabet more successful? Are there especially good faculties for cer- 
tain fields?) to mention just a few. 


Markets versus Contests for the Provision of Information 
Goods 


by 
MARTIN KOLMAR* 


1 Introduction 


“It appears that patent policy is a very blunt instrument trying to solve a very delicate 
problem. Its bluntness derives largely from the narrowness of what patent breadth can 
depend on, namely the realized values of the technologies. As a consequence, the 
prospects for fine-tuning the patent system seem limited, which may be an argument for 
more public sponsorship of basic research.” 

(S. Scotchmer, Journal of Economic Perspectives, 1991). 


“In the field of industrial patents in particular we shall have seriously to examine
whether the award of a monopoly privilege is really the most appropriate form of reward
for the kind of risk bearing which investment in scientific research involves.”

(Friedrich Hayek, Individualism and Economic Order, 1948).


The classical justification for patents emphasizes the positive effects of patents 
on the incentives to invest in innovation. Granting a temporally restricted 
monopoly right increases the incentives to invest in research; however, if 
perfect price discrimination were not possible there would exist a tradeoff due 
to the welfare loss associated with the monopoly right. These welfare losses 
restrict the optimal term of the patent. 

Recent research has challenged this orthodoxy by focussing on com- 
plementarities in environments where production requires the use of multiple 
patents. The analysis of a complementary-goods oligopoly dates back to 
Cournot (1838) who found that prices tend to be higher and quantities tend to 
be lower than in a monopoly. For obvious reasons this result is also called the 
tragedy of the anti-commons because it is exactly the existence of property 
rights that leads to an inefficiency (Buchanan and Yoon 2000, Depoorter, 
Parisi, and Schulz 2001). 
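
Cournot's result can be reproduced in a textbook special case. The sketch below rests on assumptions that are not taken from this paper: demand for the final good is linear, Q = a - P, where P is the sum of the component prices; each of n sellers controls one indispensable component; and costs are zero. Seller i's first-order condition, a - 2 p_i - (sum of the other prices) = 0, then yields the symmetric price a / (n + 1).

```python
from fractions import Fraction


def complements_equilibrium(a, n):
    """Symmetric price equilibrium with n sellers of perfectly complementary inputs.

    Assumptions (illustrative only): linear demand Q = a - P, P = sum of the
    n component prices, zero costs. n = 1 is the integrated monopoly.
    """
    price_each = Fraction(a, n + 1)   # from the first-order condition
    total_price = n * price_each      # what a buyer pays for the full bundle
    quantity = a - total_price
    return total_price, quantity


monopoly = complements_equilibrium(12, 1)
two_owners = complements_equilibrium(12, 2)
print(monopoly, two_owners)
```

With a = 12, the integrated monopolist charges 6 and sells 6 units, whereas two separate rights holders charge 8 in total and sell only 4 units: a higher price and a lower quantity than under monopoly.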

In addition to the anti-commons problem that is based on a negative
externality between different patent holders (contrary to the negative
externality in the case of a substitutive-goods oligopoly), a number of other
patent-system related problems are being discussed, ranging from hold-ups when
patents are complementary and excessive information costs due to an excessive
number of potential patent infringements to inefficient designing-around efforts
that raise costs and/or reduce product quality (Shapiro 2001, 2004).


* I thank Salvatore Barbaro, Roland Kirstein, Dorothee Schmidt, and Dana Sisak for
helpful comments. 
Shapiro (2001) has coined the term ‘patent thicket’ to focus attention on the
transaction costs that are associated with highly fragmented property rights. 

The inefficiency of a specific mechanism, however, cannot be considered to 
be problematic as long as it has not been demonstrated that alternative 
mechanisms exist that support equilibria that — given a normative criterion — 
dominate the equilibrium of the patent mechanism. The literature mentions 
alternative mechanisms, however, without explicitly analyzing their specific 
properties. 

One obvious class of alternative mechanisms is a contest (see Tullock 1980) 
or tournament where researchers compete for a prize, for example, research 
grants, tenure positions, etc. Contests award the prize according to a
relative-performance measure, for example, scientific publications. Each researcher
can influence his probability of winning the prize by increasing his research 
activities. There is a large literature on contests and tournaments that analyzes 
the properties of this type of mechanism, mostly for the case of private goods 
(see Lazear 1997). 

There is one specific feature of research or information goods, namely, that 
they are non-rivalrous in nature (see Che and Gale 2003, Kolmar and 
Wagener 2004, Morgan 2000). Similar to the tournament literature, invest- 
ments in a contest are socially productive. The specific feature of the pro- 
duction process of information goods is, however, that the resulting goods are 
non-rival. Basically, the idea is to introduce a compensating (i.e., negative) 
externality to resolve the under-provision dilemma present in voluntary- 
contribution games in the provision of public goods. Information goods are 
transformed into public goods if the exclusion mechanism is not applied. The 
idea of a compensating externality to promote the production of public goods 
can already be found in Cornes and Sandler (1984, 1994). 

One of the novel aspects of this paper is to understand exclusion as a social 
agreement and not an exogenous property of the specific good. If the society 
decides to grant patent rights on innovations, the exclusion mechanism is 
applied. If, on the other hand, the society decides not to grant patent rights, an 
information good becomes public property. The decision whether to grant 
property rights or not should therefore depend on the economic costs and 
benefits of the alternative mechanism. 

In a transaction-costs free world one would assume that both types of 
mechanisms turn out to be equally efficient. It is therefore of crucial impor- 
tance to understand the idiosyncratic transaction costs of both types of 
mechanisms. In this paper we identify a new potential problem of the patent 
mechanism, namely, that the holder of a patent may have an incentive to 
inefficiently restrict the number of licenses sold. We restrict attention to 
information goods that have the character of a process innovation that, if
applied, reduces the production costs of firms in a downstream market.
Hence, a patent holder can influence the competitive structure on the 
downstream market with his license policy. We show that for the case of an
oligopolistic downstream market with Cournot competition the innovator will
inefficiently restrict the number of licenses if the cost differential associated
with the application of the innovation is relatively large. Under these cir- 
cumstances the patent system has welfare costs in terms of a reduced sum of 
rents on the market. 

However, these welfare costs may be unavoidable as long as it is not pos- 
sible to characterize an alternative mechanism that avoids these inefficiencies 
without creating new types of welfare losses that are even worse. For the 
special case of risk-neutral innovators we show that under a mild restriction 
the contest mechanism in fact dominates the patent mechanism. However, if 
one allows for risk-averse innovators the introduction of a contest increases 
the individually perceived risk. No clear-cut results concerning the optimal 
balance of both types of mechanisms have been derived for this case. How- 
ever, we show for a special case that both types of ‘corner solutions’ may turn 
out to be second-best optimal. Depending on the specific structure of the 
model it can turn out that the patent mechanism dominates the contest 
mechanism and vice versa. 

The paper proceeds as follows. The model is introduced and solved in 
Section 2. The model has three stages and is solved by backwards induction. 
Therefore, the equilibrium on the downstream market is analyzed in Sec- 
tion 2.1. The optimal license policy of a patent holder is analyzed in Sec- 
tion 2.2. Section 2.3 analyzes the incentives to innovate and the optimal bal- 
ance of incentive schemes. In Section 2.3.1 the analysis is restricted to risk- 
neutral innovators. It is extended to risk aversion in Section 2.3.2. Finally, the 
optimal balance of incentive schemes with risk-averse innovators is derived 
for a specific functional specification in Section 2.3.3. Section 3 concludes. 


2 The model 


We assume an economy with one representative innovator from a set of n
potential innovators (researchers) with generic index i. Every innovator can
devote $l_i \geq 0$ units of time and effort to the production of innovations. He
derives utility out of monetary income, $y_i$ (positive), and time and effort spent
for research (negative). All innovators have identical utility functions that are
linear in $l_i$ and concave in $y_i$ and that are consistent with the
von-Neumann-Morgenstern axioms for expected utility,


(1) $u(y_i, l_i) = v(y_i) - l_i.$


For simplicity we assume that there is a measure of scientific output, for
example, published research papers, that is perfectly correlated with $l_i$. Hence,
we will use $l_i$ as a measure for scientific output directly in the following
analysis. Scientific output is an information good, which means that it is
nonrivalrous in a sense that will be exactly defined below.

We assume that scientific output has the character of a process innovation:
if an innovation occurs, it has the potential to reduce production costs in a
downstream product market. To be more specific, there are m potential firms
with generic index j that produce quantities $x = \{x_1, \ldots, x_m\}$ of a
consumption good by means of a linear cost function $C(x_j) = c \cdot x_j$.
Without scientific innovation the unit costs are equal to $c_h > 0$. If an
innovation occurs, any firm that uses this innovation is able to reduce its
unit costs to $c_l \in [0, c_h)$. The innovation is nonrival because its use
by firm j does not preclude firm k from using it, and the costs of an
additional user are equal to zero. The market exists for a span of time T, and
for simplicity we abstain from discounting.

An innovation is called indispensable, or perfectly complementary to the
production process, if $c_h \to \infty$. This terminology allows us to refer
to other results in the literature that focus on the role of complementarities
between different inputs. We assume that all firms supply a homogeneous good
and engage in Cournot competition.

The incentives to devote time and effort to the production of scientific 
output are given by means of two basic mechanisms. 

- First, there may exist a tournament-type incentive scheme that can be
thought of as a contest among scientists for research grants or for lecturer or
tenure positions at research institutes or universities. This contest maps any
vector of individual scientific output $l = \{l_1, \ldots, l_n\}$ into a vector
of probabilities $\gamma = \{\gamma_1, \ldots, \gamma_n\}$ for getting a fixed
prize $z > 0$. To be more specific, we assume that this contest is of the
Tullock type,


(2) $\gamma_i(l) = \dfrac{l_i}{\sum_{j} l_j}.$


- Second, there exists a patent system that protects the innovation for
$t \in [0, T]$ periods of time. We assume that the time and effort spent on
research influence the probability of generating a patentable idea. Let
$\rho_i(l_i) \in [0, 1]$, $\rho'(\cdot) > 0$, $\rho'(0) \to \infty$,
$\rho''(\cdot) < 0$ be the probability of generating a patentable idea. For
simplicity we assume that the direct and opportunity costs of getting a patent
are zero. If researcher i has a patent on an innovation, he can sell it to
$s \in \{0, 1, \ldots, m\}$ firms in the downstream market. For simplicity we
assume that there is no discounting of future payoffs.


Both incentive schemes can be summarized by a vector {z, t}. Denote by
$s \cdot r_i$ the sum of royalties paid by the s firms using the innovation.
Taken together, innovator i has the following modified utility function:

(3) $E[u_i(l)] = \rho_i(l_i)\,\gamma_i(l)\, v(z + s \cdot r_i) + (1 - \rho_i(l_i))\,\gamma_i(l)\, v(z) + \rho_i(l_i)\,(1 - \gamma_i(l))\, v(s \cdot r_i) - l_i,$

where we have w.l.o.g. normalized the utility of $y_i = 0$ to zero. For the
special case of risk neutrality, the underlying preferences can be represented
by a utility function


(4) $u_i(l) = \gamma_i(l)\, z + \rho_i(l_i)\, s \cdot r_i - l_i.$
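
As a minimal sketch of how the two mechanisms enter the risk-neutral payoff (4), the following Python fragment evaluates the Tullock probabilities (2) and the resulting payoff. The function names, the equal-split convention when all efforts are zero, and the example patent probability rho(l) = l/(l+1) are assumptions made for illustration; they are not part of the model above.

```python
from typing import Callable, Sequence


def tullock_probabilities(efforts: Sequence[float]) -> list[float]:
    """Tullock contest success function: gamma_i = l_i / sum_j l_j."""
    total = sum(efforts)
    if total == 0:
        # Assumed convention: equal chances if nobody exerts any effort.
        return [1.0 / len(efforts)] * len(efforts)
    return [l / total for l in efforts]


def risk_neutral_payoff(
    i: int,
    efforts: Sequence[float],
    prize: float,
    royalties: float,
    rho: Callable[[float], float],
) -> float:
    """Payoff (4): gamma_i(l) * z + rho_i(l_i) * (s * r_i) - l_i.

    `royalties` stands for the total royalty income s * r_i.
    """
    gamma_i = tullock_probabilities(efforts)[i]
    return gamma_i * prize + rho(efforts[i]) * royalties - efforts[i]


if __name__ == "__main__":
    # Hypothetical numbers: two researchers, prize z = 8, royalties s*r = 4.
    example_rho = lambda l: l / (l + 1.0)
    print(tullock_probabilities([1.0, 3.0]))
    print(risk_neutral_payoff(0, [1.0, 3.0], 8.0, 4.0, example_rho))
```

With the hypothetical numbers in the usage lines, researcher 0 wins the prize with probability 1/4 and obtains a patentable idea with probability 1/2.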


If neither incentive system exists, it follows from (1) that $l_i = 0$ for
every researcher i. If society decides to use only the contest mechanism and
not to protect the innovation by a patent system, t = 0, the innovation falls
into the public domain, and every firm j can use it for free. If society
decides to increase incentives by means of the patent system, every firm that
uses the innovation within the first t periods has to pay royalties $r_i$ to
the innovator. Afterwards the innovation again falls into the public domain
and can be used for free.

The sequence of the game is as follows: 

1. At stage 1 the potential innovators simultaneously and non-cooperatively
choose $l_i$, and the prize is awarded according to (2). Because we are
ultimately interested in the optimal design of an incentive scheme {z, t}, the
comparative-static behavior of this equilibrium would have to be taken into
consideration.

2. At stage 2 an innovator who has been successful in developing a patent 
bargains with the potential users of the innovation, firms 1, ..., m about the 
royalties paid. The outcome of the bargaining determines the number and 
identity of low-cost firms on the downstream market during the time span t 
that is protected by the patent. 

3. At stage 3 the firms at the downstream market determine their optimal 
production plan. 


The game is solved by backwards induction. 


2.1 Stage 3 


There are two possible scenarios on the downstream market. During the
period of patent protection, only those firms that have paid royalties to the
innovator have access to the low-cost technology, whereas afterwards all firms
have access to it. Fortunately the second case is a special case of the first
(with s = m), and we can therefore restrict attention to the solution of the
following game: assume that $s \le m$ is the number of low-cost firms in the
market. Market demand is given by the function
$p(x) = a - b \sum_{j=1}^{m} x_j$. The maximization problem of a firm is
given by

(5) $\pi_j^h(x) = t\,(p(x) - c_h)\, x_j,$

if it is a high-cost firm, and by


(6) $\pi_j^l(x) = t\,(p(x) - c_l)\, x_j - r_j,$


if it is a low-cost firm. It is straightforward to show that a Cournot-Nash
equilibrium of this game has the following structure: all low-cost firms
produce the same quantity and all high-cost firms produce the same quantity,
given by


(7) $x^h(s) = \dfrac{a - (1 + s)\,c_h + s\,c_l}{(1 + m)\,b},$

(8) $x^l(s) = \dfrac{a - (1 + m - s)\,c_l + (m - s)\,c_h}{(1 + m)\,b},$


for an interior solution. The associated profit levels are equal to


(9) $\pi^h(s) = t\,\dfrac{\left(a - (1 + s)\,c_h + s\,c_l\right)^2}{(1 + m)^2\, b},$

(10) $\pi^l(s) = t\,\dfrac{\left(a - (1 + m - s)\,c_l + (m - s)\,c_h\right)^2}{(1 + m)^2\, b}.$


Total output is therefore


(11) $X^*(s) = (m - s)\,x^h(s) + s\,x^l(s) = \dfrac{a\,m - (m - s)\,c_h - s\,c_l}{(1 + m)\,b},$


and the equilibrium price is equal to


(12) $p^*(s) = \dfrac{a + c_h\,(m - s) + c_l\, s}{1 + m}.$
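
The closed forms (7) to (12) can be checked numerically. The sketch below, with hypothetical parameter values and names chosen only for illustration, computes the interior Cournot-Nash outcome for s low-cost and m - s high-cost firms under linear demand p = a - bX, so that the first-order condition p - c - b x = 0 can be verified for each firm type.

```python
from dataclasses import dataclass


@dataclass
class CournotOutcome:
    x_high: float        # quantity of each high-cost firm, eq. (7)
    x_low: float         # quantity of each low-cost firm, eq. (8)
    total_output: float  # eq. (11)
    price: float         # eq. (12)
    profit_high: float   # eq. (9), gross of royalties
    profit_low: float    # eq. (10), gross of royalties


def cournot_equilibrium(a: float, b: float, c_high: float, c_low: float,
                        m: int, s: int, t: float = 1.0) -> CournotOutcome:
    """Interior Cournot-Nash equilibrium with s low-cost firms out of m."""
    x_high = (a - (1 + s) * c_high + s * c_low) / ((1 + m) * b)
    x_low = (a - (1 + m - s) * c_low + (m - s) * c_high) / ((1 + m) * b)
    if s < m and x_high < 0:
        raise ValueError("corner solution: high-cost firms would not produce")
    total = (m - s) * x_high + s * x_low
    price = a - b * total
    return CournotOutcome(
        x_high=x_high,
        x_low=x_low,
        total_output=total,
        price=price,
        profit_high=t * (price - c_high) * x_high,
        profit_low=t * (price - c_low) * x_low,
    )


if __name__ == "__main__":
    # Hypothetical example: a = 100, b = 1, c_h = 20, c_l = 10, m = 4, s = 2.
    print(cournot_equilibrium(100.0, 1.0, 20.0, 10.0, m=4, s=2))
```

The function raises an error instead of silently returning negative quantities, because the formulas only describe an interior solution.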


In an efficient solution, only the s low-cost firms would produce until price is
equal to marginal costs. Hence, we can calculate the deadweight loss as


$DL^*(s) = \dfrac{t}{2}\,(p^* - c_l)\left(\dfrac{a - c_l}{b} - X^*\right),$


which is equal to


(13) $DL^*(s) = t\,\dfrac{\left(a - c_l + (m - s)(c_h - c_l)\right)^2}{2b\,(1 + m)^2}.$

We can use the expression for the excess burden to establish a first result.
Differentiating (13) with respect to s yields for all $s \le m$ and $t > 0$


(14) $\dfrac{\partial DL^*(s)}{\partial s} = -t\,\dfrac{(c_h - c_l)\left((a - c_l) + (m - s)(c_h - c_l)\right)}{(1 + m)^2\, b} < 0,$


which implies

Proposition 1: The deadweight loss is decreasing in the number of licences s. 
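
Proposition 1 can be illustrated numerically. The following sketch computes the deadweight loss as the welfare triangle between the equilibrium price and the low marginal cost, using the total output (11); the triangle reading of the definition and the parameter values are assumptions made for this illustration.

```python
def deadweight_loss(a: float, b: float, c_high: float, c_low: float,
                    m: int, s: int, t: float = 1.0) -> float:
    """Welfare triangle between equilibrium output and the efficient output."""
    total_output = (a * m - (m - s) * c_high - s * c_low) / ((1 + m) * b)
    price = a - b * total_output
    efficient_output = (a - c_low) / b  # price equals the low marginal cost
    return 0.5 * t * (price - c_low) * (efficient_output - total_output)


if __name__ == "__main__":
    # Hypothetical example: a = 100, b = 1, c_h = 20, c_l = 10, m = 4.
    for s in range(5):
        print(s, deadweight_loss(100.0, 1.0, 20.0, 10.0, m=4, s=s))
```

In this example the loss falls monotonically as the number of licensed firms rises from 0 to m.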

It is useful to refer to two special cases before we turn to a discussion of the 
general solution. 

All firms use the low-cost technology. In the case where all firms have free
access to the innovation, quantities, profits, and the deadweight loss are equal
to:


(15) $x^l(m) = \dfrac{a - c_l}{(1 + m)\,b},$

(16) $\pi^l(m) = t\,\dfrac{(a - c_l)^2}{(1 + m)^2\, b},$

(17) $DL^*(m) = t\,\dfrac{(a - c_l)^2}{2b\,(1 + m)^2}.$


High-cost firms drop out of the market. According to (7), the supply of the
high-cost firms is equal to zero if and only if
$c_h \ge \hat{c}_h(s) := (a + c_l\, s)/(1 + s)$. In this case, s firms remain
active in the market, and quantities, profits, and the deadweight loss are
equal to:


(18) $x^l(s, \hat{c}_h(s)) = \dfrac{a - c_l}{(1 + s)\,b},$

(19) $\pi^l(s, \hat{c}_h(s)) = t\,\dfrac{(a - c_l)^2}{(1 + s)^2\, b},$

(20) $DL^*(s, \hat{c}_h(s)) = t\,\dfrac{(a - c_l)^2}{2b\,(1 + s)^2}.$


2.2 Stage 2


At stage 2 an innovator who has been successful in patenting his innovation 
has to determine the royalties he charges the firms who buy a license of the 
innovation as well as the number and identity of the firms to whom a license is 
sold. From the point of view of a single firm j that assumes that s - 1 other
firms pay royalties, paying royalties is rational if and only if

(21) $\underbrace{\pi^l(s) - \pi^h(s - 1)}_{=:\,\Delta\pi(s)} - r_j \ge 0.$


For simplicity we assume that the innovator can extract a fraction
$\alpha \in (0, 1]$ of the additional profit $\Delta\pi(s)$ as royalty from
each firm buying a license. Hence, the only problem of the innovator is to
determine the optimal number of licenses s. His optimization problem is


(22) $\max_{s}\; \alpha\, s\, \Delta\pi(s).$


Indispensable innovation. We start by analyzing the solution of this
optimization problem for the special case that the innovation is
indispensable/perfectly complementary. With this we can concentrate on a
potential efficiency problem of the patent system for the case of non-rival
goods in the most focused way. In fact we can relax the assumption of
indispensability by assuming that $c_h \ge \hat{c}_h(m)$. In this situation,
$\Delta\pi(s) = \pi^l(s)$. Treating s as a continuous variable, it follows that


(23) $\dfrac{\partial\left(s\,\Delta\pi(s)\right)}{\partial s} = -t\,\dfrac{(a - c_l)^2\,(s - 1)}{b\,(1 + s)^3},$


which is equal to zero if and only if s = 1.¹

Proposition 2: If the innovation is indispensable, the innovator will sell only 
one license. 
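
A short numerical illustration of Proposition 2: when only licensed firms are active, each of the s licensees earns the symmetric Cournot profit, so the aggregate profit the innovator can tap is s times that amount. The sketch below (hypothetical parameters a = 100, b = 1, c_l = 0) simply evaluates this expression for integer s.

```python
def aggregate_licensee_profit(s: int, a: float, b: float, c_low: float,
                              t: float = 1.0) -> float:
    """s * pi_l(s) when only the s licensed firms produce."""
    return s * t * (a - c_low) ** 2 / ((1 + s) ** 2 * b)


if __name__ == "__main__":
    for s in range(1, 6):
        print(s, round(aggregate_licensee_profit(s, 100.0, 1.0, 0.0), 2))
```

The value is largest at s = 1 and falls with every additional license, in line with the standard result that aggregate Cournot profits decline in the number of competitors.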

In the light of Proposition 1 this solution demonstrates that the patent 
system as an incentive mechanism may be problematic from the point of view 
of economic welfare. At least in the case of an indispensable innovation the 
innovator has an incentive to restrict access to its innovation and to create a 
monopoly in the downstream market. The economic intuition for this result is 
straightforward: It is a standard result from oligopoly theory that aggregate 
profits in a Cournot-market are decreasing in the number of competitors. 

The general case. In the general case, setting the partial derivative of (22)
with respect to s equal to zero and solving for s yields:²


(24) $s^* = \dfrac{2\,(a - c_l) + m\,(c_h - c_l)}{4\,(c_h - c_l)}.$


It is easy to check that s* is increasing in $c_l$ and m, and decreasing in
$c_h$. This implies that the optimal number of licences is decreasing in the
cost difference $(c_h - c_l)$ that can be realized by buying the licence. An
analysis of (24) shows that the optimal number s* is equal to m if and only if
$c_h \le c_h(s^* = m) := c_l + 2\,(a - c_l)/(3m)$. This condition is
illustrated in Figure 1.


1 It is straightforward to check that this solution constitutes the global
maximum of the optimization problem.
2 Again, it is straightforward to check that the second-order conditions are
fulfilled.
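
The threshold can be retraced step by step from the profit levels (9) and (10) and the licensing problem (22). The abbreviations $A$ and $d$ are introduced here only for readability, and the steps assume an interior equilibrium in which the high-cost firms remain active:

```latex
\begin{aligned}
A &:= a - c_l, \qquad d := c_h - c_l, \\
\Delta\pi(s) &= \pi^l(s) - \pi^h(s-1)
  = \frac{t}{(1+m)^2 b}\Big[\big(A + (m-s)\,d\big)^2 - \big(A - s\,d\big)^2\Big]
  = \frac{t\, m\, d\,\big(2A + (m-2s)\,d\big)}{(1+m)^2 b}, \\
\frac{\partial\big(s\,\Delta\pi(s)\big)}{\partial s}
  &= \frac{t\, m\, d\,\big(2A + m\,d - 4\,d\,s\big)}{(1+m)^2 b} = 0
  \;\Longleftrightarrow\; s^* = \frac{2A + m\,d}{4\,d}, \\
s^* \ge m &\;\Longleftrightarrow\; d \le \frac{2A}{3m}
  \;\Longleftrightarrow\; c_h \le c_l + \frac{2\,(a - c_l)}{3m}.
\end{aligned}
```

The second derivative of $s\,\Delta\pi(s)$ is proportional to $-4d < 0$, so the first-order condition identifies a maximum.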


Figure 1 Equilibrium number of licenses 


In the figure above, the area below the solid line defines all parameter
values for which the optimal number of licences is equal to the number of
firms operating in the market. The area above the solid line gives all
parameter values for which the optimal number of licences is smaller than the
number of firms. Both areas are divided by $c_h(s^* = m)$, which crosses the
45°-line at $c_h = c_l = a$, the point where unit costs are equal to the
maximum willingness to pay in this market. Hence, we get a generalization of
Proposition 2.

Proposition 3: If the innovation reduces unit costs by less than
$c_h(s^* = m) - c_l$, the innovator will sell m licences. If unit costs are
reduced by more than $c_h(s^* = m) - c_l$, the innovator will sell fewer than
m licences.
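
Proposition 3 can be illustrated with a small sketch that compares the continuous optimum with a brute-force search over integer license numbers. The revenue function uses the interior Cournot profits of Stage 3; all parameter values (a = 200, b = 1, c_l = 10, m = 4) and function names are hypothetical and chosen so that the high-cost firms stay active.

```python
def licence_revenue(s: int, a: float, b: float, c_high: float, c_low: float,
                    m: int, alpha: float = 1.0, t: float = 1.0) -> float:
    """alpha * s * [pi_l(s) - pi_h(s - 1)] in an interior Cournot equilibrium."""
    scale = t / ((1 + m) ** 2 * b)
    pi_low = scale * (a - (1 + m - s) * c_low + (m - s) * c_high) ** 2
    pi_high_before = scale * (a - s * c_high + (s - 1) * c_low) ** 2
    return alpha * s * (pi_low - pi_high_before)


def continuous_optimum(a: float, c_high: float, c_low: float, m: int) -> float:
    """First-order condition for s, capped at the number of firms m."""
    s_star = (2 * (a - c_low) + m * (c_high - c_low)) / (4 * (c_high - c_low))
    return min(float(m), s_star)


def threshold_cost(a: float, c_low: float, m: int) -> float:
    """Largest c_h for which the innovator licenses all m firms."""
    return c_low + 2 * (a - c_low) / (3 * m)


if __name__ == "__main__":
    a, b, c_low, m = 200.0, 1.0, 10.0, 4
    print("threshold:", threshold_cost(a, c_low, m))
    for c_high in (30.0, 50.0):
        best = max(range(1, m + 1),
                   key=lambda s: licence_revenue(s, a, b, c_high, c_low, m))
        print(c_high, continuous_optimum(a, c_high, c_low, m), best)
```

With a small cost reduction (c_h = 30) all four firms are licensed, whereas with a large one (c_h = 50) the integer optimum is three licenses.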

What is the economic rationale for this result? Selling a licence to an 
additional firm has two opposing effects. First it increases the profit of the 
innovator directly because an additional licence is sold. Second it changes the 
competitiveness on the Cournot-market because the number of low-cost firms 
increases. This reduces the profits of the other firms using the licence. If the 
potential for cost savings of the innovation is relatively small, the second 
effect is dominated by the first, whereas the opposite occurs for large cost 
savings. 

The result establishes an additional explanation for the inefficiency of the 
patent mechanism as a means to shape incentives for research. The standard 
literature on patents has focused on the deadweight loss created by the 
monopolistic holder of a patent if he is not able to perfectly discriminate
prices between users of the patent. This source of inefficiency is absent in our
model because the royalties paid by the firms using the innovation are sunk 
when they operate on the downstream market. The second argument dates 
back to Cournot (1838) and emphasizes the “anti-commons” problem if dif- 
ferent patents are complementary from the point of view of the potential users 
(see Buchanan and Yoon 2000, Depoorter, Parisi, and Schulz 2001, Shapiro 
2001, 2004). In this case, equilibrium prices tend to be higher than those set by 
a monopoly holder of the whole set of complementary patents. Our argument 
rests on the observation that (i) a number of innovations are process inno- 
vations that influence the competitive structure in a downstream market and 
(ii) innovations are non-rivalrous. Non-rivalry and efficient marginal-cost 
pricing imply that the innovation should be used as widely as possible from 
the point of view of economic efficiency. However, the patent holder may 
have an incentive to suppress access to his innovation in order to maximize 
profits. 

We conclude this section with an analysis of the optimal regulation of the
patent system, given that an innovation has occurred. If the normative
criterion is to minimize the deadweight loss in the downstream market, two
cases have to be distinguished.

1. If s* = m, the duration of a patent t is irrelevant with respect to the
associated deadweight loss. t has, however, an impact on the distribution of
rents between the innovator and the downstream firms. Hence, if the innovator
voluntarily sells m licences (the innovation reduces unit costs by less than
$c_h(s^* = m) - c_l$), the patent system has only an impact on the
distribution of rents. It is therefore possible to vary t in order to shape
incentives for innovation at Stage 1 of the game without efficiency costs in
the downstream market. This observation has important consequences for a
comparison with the alternative contest mechanism. If the patent and the
contest mechanism have in principle the same incentive effects and impacts on
individual utility, the choice of a specific type of mechanism is irrelevant
with respect to economic welfare. However, if a contest mechanism has
idiosyncratic welfare costs or if individual incentives cannot be adequately
shaped by this class of mechanisms, the patent system turns out to be
superior. Hence, for innovations implying “small” cost reductions the burden
of proof rests on the contest mechanism.

2. If s* < m, the duration of a patent t has an impact on the associated
deadweight loss: the longer t, the higher DL. Hence, if the innovation reduces
unit costs by more than $c_h(s^* = m) - c_l$, patents ceteris paribus reduce
economic welfare. Extending t in order to improve incentives for innovation at
Stage 1 therefore has efficiency costs. Hence, if the incentive and utility
consequences of the contest and the patent mechanism are identical for the
class of innovators, we have an argument in favor of a contest mechanism if
the cost reductions of an innovation are sufficiently big.

By (10), the innovator’s profit from the royalties is equal to


(25) $s^* r = \dfrac{\alpha\, t\, m\,\left(2\,(a - c_l) + m\,(c_h - c_l)\right)^2}{8b\,(1 + m)^2} =: \Theta_i.$


Note that these profits are independent of the innovator’s investments $l_i$
at stage 1.


2.3.1 Risk neutrality 

If the innovator is risk neutral, we can use $y_i$ as a measure of utility. At
Stage 1, every potential innovator anticipates the potential profit
$\Theta_i$ that results if his research leads to an innovation that can be
patented. Hence, his optimization problem (4) becomes


(26) $u_i(l) = \gamma_i(l)\, z + \rho_i(l_i)\, \Theta_i - l_i.$

The derivative with respect to $l_i$ is equal to

(27) $\dfrac{\partial u_i}{\partial l_i} = \dfrac{\partial \gamma_i}{\partial l_i}\, z + \dfrac{\partial \rho_i}{\partial l_i}\, \Theta_i - 1.$


First note that $l_i = 0$ if z = 0 and $\Theta_i = 0$, which means that
incentives for innovations are provided neither by the patent system nor by a
contest mechanism. In order to see whether it is possible to provide optimal
incentives by an adequate design of both mechanisms, we first have to specify
optimality.

Optimality. In order to determine the optimal incentive scheme to promote 
innovations, it is necessary to characterize the conditions for an optimal sol- 
ution. We define optimality by the maximization of the expected sum of 
consumer and producer surpluses plus the sum of utilities of the innovators 
and start with a characterization of the first-best. 

Given that an innovation takes place, it is obvious that all m downstream
firms should have access to it for the whole time T. The sum of consumer and
producer surpluses is therefore $S^m = T\,(a - c_l)^2/(2b) - DL^*(m)$. Utility
of researcher i is equal to $u_i(y, l) = y_i - l_i = -l_i$. In addition, and
to close the model, we assume that z can be financed by a lump-sum tax imposed
on the downstream market, which implies that $z \le Z := \sum_i S^m$. Given
risk neutrality and additivity, this tax cancels from the equation that
characterizes aggregate welfare:


(28) $W(l) = \sum_i \rho_i(l_i)\, S^m - \sum_i l_i.$


The first-best investments in innovation are therefore characterized by the
following first-order conditions:

(29) $\dfrac{\partial W}{\partial l_i} = \dfrac{\partial \rho_i}{\partial l_i}(l_i^o)\, S^m - 1 = 0,$


where we denote the optimal values of l by the superscript “o”.
A comparison of (27) and (29) shows that, in order to induce optimal
research incentives for researcher i, it is necessary to have


(30) $\dfrac{\partial \gamma_i}{\partial l_i}(l^o)\, z + \dfrac{\partial \rho_i}{\partial l_i}(l^o)\, \Theta_i = \dfrac{\partial \rho_i}{\partial l_i}(l^o)\, S^m.$


Assume first that z = 0. In this case, (30) becomes

(31) $\Theta_i = S^m.$


However, $S^m$ is the maximum surplus that results from the innovation and is
therefore always strictly larger than $\Theta_i$, the maximum profit of the
innovator from licensing his innovation, even if the patent span is extended
to its maximum T.

Proposition 4: It is impossible to provide first-best optimal incentives to 
innovate by the patent system alone. 

Next assume that t = 0. In this case, (30) becomes


(32) $\dfrac{\partial \gamma_i}{\partial l_i}(l^o)\, z = \dfrac{\partial \rho_i}{\partial l_i}(l^o)\, S^m.$

For the case of the Tullock CSF and in a symmetric equilibrium, the condition
becomes


(33) $z = \dfrac{\dfrac{\partial \rho_i}{\partial l_i}(l^o)\, S^m}{\dfrac{n - 1}{n^2\, l^o}}.$


The right-hand side of (33) is a positive finite number. The feasibility
constraint $z \le Z$ imposes the additional condition
$l^o\,\dfrac{\partial \rho}{\partial l}(l^o) \le (n - 1)/n$. This leads to the
following conclusion.

Proposition 5: There exists exactly one positive and finite prize z for which
efficient research incentives can be induced. A contest mechanism can be used
to induce efficient incentives if
$l^o\,\dfrac{\partial \rho}{\partial l}(l^o) \le (n - 1)/n$.
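
The logic of Proposition 5 can be made concrete with a small numerical sketch. It assumes, purely for illustration, the patent probability rho(l) = l/(l+1) that appears later in the functional specification (this form does not satisfy the boundary condition on rho'(0) stated in Section 2), and a hypothetical surplus S^m = 16 with n = 4 researchers. It computes the first-best effort, the prize that implements it in a symmetric Tullock equilibrium without patents, and the feasibility check.

```python
import math


def first_best_effort(surplus: float) -> float:
    """Solve rho'(l) * S = 1 for the assumed rho(l) = l / (l + 1)."""
    return max(0.0, math.sqrt(surplus) - 1.0)


def implementing_prize(surplus: float, n: int) -> tuple[float, float, bool]:
    """Return (first-best effort, prize z, feasibility of z <= n * S)."""
    l_opt = first_best_effort(surplus)
    rho_prime = 1.0 / (l_opt + 1.0) ** 2
    # Symmetric Tullock contest: d gamma_i / d l_i = (n - 1) / (n^2 * l).
    prize = rho_prime * surplus * n ** 2 * l_opt / (n - 1)
    feasible = l_opt * rho_prime <= (n - 1) / n
    return l_opt, prize, feasible


def marginal_contest_payoff(l_own: float, l_others: float, prize: float,
                            n: int) -> float:
    """Derivative of gamma_i * z - l_i with respect to l_i (case t = 0)."""
    total = l_own + (n - 1) * l_others
    return prize * (total - l_own) / total ** 2 - 1.0


if __name__ == "__main__":
    l_opt, prize, feasible = implementing_prize(surplus=16.0, n=4)
    print(l_opt, prize, feasible)
    print(marginal_contest_payoff(l_opt, l_opt, prize, n=4))
```

At the computed prize the marginal contest payoff vanishes exactly at the first-best effort, which is the sense in which the contest implements the efficient allocation under risk neutrality.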

This in principle positive result depends crucially on the assumption of risk 
neutrality of the researchers. Any degree of risk aversion would make it 
impossible to implement the optimal allocation because individuals would be 
exposed to additional risk which would ceteris paribus decrease their expected 
utility.


2.3.2 Risk aversion

In this section we will focus attention on the behavior of a single innovator
who treats the other innovators’ choices of effort as exogenous. Because of the
associated technical complexity, we set aside the complete analysis of the
comparative-static behavior of the Nash-equilibrium choices of $l_i$. For the
case of risk aversion, the expected utility of an innovator is given in (3).
Given incentive scheme {z, t}, the innovator will set $l_i(z, t)$ such that


(34) $\dfrac{\partial E[u_i(l)]}{\partial l_i} = 0.$


Due to the strict concavity of $\rho(\cdot)$ and $\gamma(\cdot)$ in $l_i$ and
the boundary behavior of both functions, a unique solution exists. Inserting
this solution into the expected utility function yields an optimal-value
function $\Phi(z, t)$.

For every incentive scheme {z, t} and $l_i(z, t)$, the expected total welfare
on the downstream market is equal to


(35) $W(z, t) = \rho(l_i(z, t))\left((1 - \alpha)\, t \cdot s \cdot \pi^s + (T - t) \cdot m \cdot \pi^m\right) + \rho(l_i(z, t))\left(t \cdot CS^s + (T - t)\, CS^m\right) + \left(1 - \rho(l_i(z, t))\right)\left(T \cdot m \cdot \pi + T \cdot CS\right) - \left(\gamma(l_i(z, t)) + \lambda\left(1 - \gamma(l_i(z, t))\right)\right) z.$


Given our focus on a representative researcher, z can have two different
interpretations from the point of view of society. First, and this is the
interpretation consistent with the idea of a contest, society has to pay z
with probability 1 because it is only the relatively most successful
researcher who will win the prize. In this case, $\lambda = 1$. Second, given
that we restrict attention to a representative researcher in this section, z
can also be interpreted as the contingent payment to researcher i that occurs
with probability $\gamma(l_i)$ from the point of view of society. In this
case, $\lambda = 0$. We will differentiate between both interpretations in the
following analysis because the results turn out to be sensitive with respect
to the specific interpretation. $CS^s$ and $CS^m$ denote consumer surplus if
s, m firms use the innovation, and $\pi$, $CS$ denote profits and consumer
surplus if no innovation occurs. Furthermore it is assumed that the prize z
can be financed by means of a lump-sum tax.

An optimal solution is again assumed to be characterized by the maximization
of the sum of the expected utility of the innovator and the welfare on the
downstream market, $\Phi(z, t) + W(z, t)$. If an interior solution exists, it
is characterized by the following first-order conditions:


(36) $\dfrac{\partial \Phi}{\partial z} + \dfrac{\partial W}{\partial z} = 0, \qquad \dfrac{\partial \Phi}{\partial t} + \dfrac{\partial W}{\partial t} = 0.$


In order to better understand the structure of the optimal solution, we start by
defining the set of {z, t} for which the innovator is indifferent. Totally
differentiating $\Phi(z, t)$ and applying the envelope theorem, it follows that
this set can be described by the condition


(37) $\dfrac{dt}{dz} = -\dfrac{\partial \Phi/\partial z}{\partial \Phi/\partial t} = -\dfrac{\partial E[u]/\partial z}{\partial E[u]/\partial t}.$


The total differential of W with respect to z, t can be written as


(38) $\dfrac{dW}{dz} = \dfrac{\partial W}{\partial l_i}\dfrac{dl_i}{dz} + \dfrac{\partial W}{\partial t}\dfrac{dt}{dz} + \dfrac{\partial W}{\partial z}.$


The term $dl_i/dz$ can be approximated by


(39) $\dfrac{\partial l_i}{\partial z} = -\dfrac{\partial^2 E[u]/\partial l_i\, \partial z}{\partial^2 E[u]/\partial l_i^2}.$


Using (37), we can write


(40) $\dfrac{dW}{dz} = -\dfrac{\partial W}{\partial l_i}\left(\dfrac{\partial^2 E[u]/\partial l_i\, \partial z}{\partial^2 E[u]/\partial l_i^2}\right) - \dfrac{\partial W}{\partial t}\left(\dfrac{\partial E[u]/\partial z}{\partial E[u]/\partial t}\right) + \dfrac{\partial W}{\partial z}.$


(40) has the following interpretation: Considering only pairs {z, t} for which
the expected utility of an innovator is constant, total welfare increases in z if
(40) is positive. We will discuss the sign of all terms in turn.

- The sign of $\partial W/\partial l_i$ is positive if the net welfare on the
downstream market in the presence of the innovation exceeds the welfare
without innovation. Whether this is the case depends on the fraction of the
profits that can be extracted by the innovator and the welfare without
innovation. The condition is unambiguously positive if the innovation is
indispensable.

- The sign of
$(\partial^2 E[u]/\partial l_i\, \partial z)/(\partial^2 E[u]/\partial l_i^2)$
is ambiguous in general. The denominator is negative because $l_i$
characterizes a maximum of the innovator’s optimization problem. It can be
shown, however, that the numerator is positive for the class of probability
functions $\rho = l_i/(l_i + D)$, $\gamma = l_i/(l_i + E)$, $D, E > 0$, and
the class of exponential utility functions, $u(x) = x^q$, $q \in (0, 1)$.

- The sign of $\partial W/\partial t$ is negative if the net welfare on the
downstream market in the presence of the innovation exceeds the welfare
without innovation. Again, whether this is the case depends on the fraction of
the profits that can be extracted by the innovator and the welfare without
innovation. The condition is unambiguously negative if the innovation is
indispensable.

- The sign of $(\partial E[u]/\partial z)/(\partial E[u]/\partial t)$ is
unambiguously positive because
$\partial E[u]/\partial z = \gamma\left((1 - \rho)\, v'(z) + \rho\, v'(\alpha s \pi t + z)\right)$
and
$\partial E[u]/\partial t = \alpha s \pi\, \rho\left((1 - \gamma)\, v'(\alpha s \pi t) + \gamma\, v'(\alpha s \pi t + z)\right)$.

- Finally, the sign of
$\partial W/\partial z = -(\gamma + \lambda(1 - \gamma))$ is unambiguously
negative.


Given the level of generality of the model, it is impossible to derive more 
constructive results. We will therefore continue with a functional specification 
of the model that allows us to better understand the tradeoffs of the model. 


2.3.3 Functional specification 

In order to get a better intuition for the implications of the general tradeoff,
we will use a functional specification of the model in the following. W.l.o.g. we
specify the oligopoly market as follows: innovation is indispensable, and the
parameters that characterize market demand and cost functions are a = 100,
b = 1, $c_l = 0$, T = 1. Recall that these specifications imply that for the
time span [0, t] the innovator sells one licence and that the resulting
monopoly profit and consumer surplus are equal to $\pi^{s=1} = a^2/4b$,
$CS^{s=1} = a^2/8b$. For the time span (t, 1] the oligopoly profit is equal to
$\pi^m = (m - 1)\,a^2/(1 + m)^2 b$, and consumer surplus is equal to
$CS^m = m^2 a^2/(1 + m)^2\, 2b$.

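In addition, we assume that the potential innovator has a utility function
$v(\cdot) = \sqrt{(\cdot)}$. Furthermore, the probabilities to win the contest
and to generate a patentable idea are perfectly and positively correlated,
$(1 - \rho(\cdot))\,\gamma(\cdot) = \rho(\cdot)\,(1 - \gamma(\cdot)) = 0$. For
convenience we denote the joint distribution by $\rho(\cdot)$ in the following
analysis and assume the functional form $\rho(l_i) = l_i/(l_i + 1)$. The
innovator’s expected utility simplifies to


$E[u(l_i, z, t)] = \rho(l_i)\, v(z + s\, t\, \alpha\, \pi^{s}) - l_i = \dfrac{l_i}{l_i + 1}\sqrt{z + t\,\alpha\, a^2/4b} - l_i.$


Welfare on the downstream market, (35), becomes:


$W(l_i, z, t) = \rho(l_i)\left((1 - \alpha)\, t\, a^2/4b + (1 - t)\, m\,(m - 1)\,a^2/(1 + m)^2 b\right) + \rho(l_i)\left(t\, CS^{s=1} + (1 - t)\, CS^m\right) - \left(\rho(l_i)\, z + \lambda\,(1 - \rho(l_i))\, z\right)$

$\quad = \dfrac{l_i}{l_i + 1}\left((1 - \alpha)\, t\, \pi^{s=1} + (1 - t)\, m\, \pi^m\right) + \dfrac{l_i}{l_i + 1}\left(t\,\dfrac{a^2}{8b} + (1 - t)\,\dfrac{m^2 a^2}{(1 + m)^2\, 2b}\right) - \left(\dfrac{l_i}{l_i + 1} + \lambda\left(1 - \dfrac{l_i}{l_i + 1}\right)\right) z.$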

The maximization of the innovator’s expected utility with respect to $l_i$
results in an effort level of


(41) $l_i(t, z, m) = \max\{0,\ (250000\,t + z)^{1/4} - 1\}.$

It is straightforward to check that the second-order conditions hold true. The
expression demonstrates two things: first, the marginal expected revenue
(measured in marginal utility) must exceed 1, the marginal loss of utility from
an additional unit of effort. Second, patent rights and prizes are perfect
substitutes from the point of view of the researcher. This latter finding is a
direct consequence of the assumption that both risks are perfectly correlated.
Inserting (41) into the expected utility function and the function measuring
welfare on the downstream market gives rise to optimal value functions
$\Phi(z, t)$, $W(z, t)$. We assume again that an optimal solution is
characterized by the maximization of the sum of both terms,
$\mathcal{W}(z, t) = \Phi(z, t) + W(z, t)$.
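
The effort level in (41) can be cross-checked numerically. The sketch below takes the constant 250000 from (41) as given, assumes the square-root utility and rho(l) = l/(l+1) of this section, and compares the closed form with a brute-force maximization of expected utility over a grid; the grid bounds and function names are arbitrary choices for illustration.

```python
import math


def effort_closed_form(z: float, t: float) -> float:
    """Effort level from eq. (41)."""
    return max(0.0, (250000.0 * t + z) ** 0.25 - 1.0)


def expected_utility(l: float, z: float, t: float) -> float:
    """rho(l) * sqrt(250000 t + z) - l with rho(l) = l / (l + 1)."""
    return l / (l + 1.0) * math.sqrt(250000.0 * t + z) - l


def effort_grid_search(z: float, t: float, upper: float = 50.0,
                       steps: int = 50000) -> float:
    """Brute-force maximizer of expected utility on an evenly spaced grid."""
    grid = (upper * k / steps for k in range(steps + 1))
    return max(grid, key=lambda l: expected_utility(l, z, t))


if __name__ == "__main__":
    for z, t in [(256.0, 0.0), (600.0, 0.0001), (0.5, 0.0)]:
        print(z, t, effort_closed_form(z, t), effort_grid_search(z, t))
```

The third example pair shows the corner case: when the combined incentive 250000 t + z is below 1, no effort is exerted.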
Contingent payment, $\lambda = 0$: In this case we get


(42) $\mathcal{W}(z, t) = (250000\,t + z)^{1/4}\left((250000\,t + z)^{1/4} - 1\right) + \dfrac{\left(5000\, m\,(3m - 2)(1 - t)/(1 + m)^2 - 246250\, t - z\right)\left((250000\,t + z)^{1/4} - 1\right)}{(250000\,t + z)^{1/4}} + 1 - (250000\,t + z)^{1/4}.$


An analysis of (42) yields the following result. 

Proposition 6: (1) For z=0, the optimal duration of a patent is positive, t > 0. 
(2) Without patent protection (t=0) the optimal prize is positive, z >0. (3) 
Numerical simulations show that the optimal prize-patent mix leads to z>0 
and t=0. 

Proof: 

ad 1: Differentiating (42) with respect to t, setting z = 0 and evaluating the
resulting equation at t = 0 yields


$\operatorname{sign}\left[\dfrac{\partial \mathcal{W}}{\partial t}\right] = -\operatorname{sign}\left[(2 - 3m)\, m\right],$


which is unambiguously positive because m > 1.

ad 2: Differentiating (42) with respect to z, and setting t = 0, we get from
(41) that $l_i = \max\{0,\ z^{1/4} - 1\}$, which is positive only if z > 1.
Hence, for all z < 1 an increase in z has no impact on incentives while
increasing costs. However, for z > 1 we get


$\operatorname{sign}\left[\dfrac{\partial \mathcal{W}}{\partial z}\right]_{z=0,\,t=0} = \operatorname{sign}\left[-1 + m\,(-10002 + 14999\, m)\right],$


which is unambiguously positive because m > 1. Obviously,
$\mathcal{W}(0, 0) = 0$. Hence, it remains to be shown that $\mathcal{W}$ does
not converge to r < 0 for increasing z, which is straightforward to prove.

Figure 2 Welfare-indifference curves for different levels of (z, t) with
probabilistic costs (darker shades = lower level of welfare).

ad 3: Figure 2 provides the intuition behind this result.

In the figure, t is plotted along the abscissa and z is plotted along the
ordinate for m = 100. Each graph represents the locus of welfare-indifference
curves. It can be seen that for ‘small’ values of z, there exists an interior
local maximum for t, whereas the local maximum for t is equal to 0 for larger
values of z. The global optimum is at a point {z, 0}, z > 0. Unfortunately it
has been impossible to derive a closed proof of this result, but it turns out
to be robust for all values m = 1, 2, ..., 200 for which a simulation has been
run.³

Part 3 of the proposition highlights that even in the presence of risk aver- 
sion the economic costs of the contest mechanism can be dominated by the 
welfare costs of an inefficient licensing-policy that exists with a patent system. 

3 Details of the simulation will be provided by the author upon request.


Non-contingent payment, $\lambda = 1$: In this case we get


(43) $\mathcal{W}(z, t) = (250000\,t + z)^{1/4}\left((250000\,t + z)^{1/4} - 1\right) + \dfrac{1250\left(197\, t + m\,(8 + 386\, t + m\,(209\, t - 12))\right)\left(1 - (250000\,t + z)^{1/4}\right)}{(1 + m)^2\,(250000\,t + z)^{1/4}} + 1 - z - (250000\,t + z)^{1/4}.$


An analysis of (43) yields the following result. 

Proposition 7: (1) For z = 0, the optimal duration of a patent is positive,
t > 0. (2) Without patent protection (t = 0) the optimal prize is positive,
z > 0. (3) Numerical simulations show that the optimal prize-patent mix leads
to z = 0 and t > 0.

Proof: The proofs of parts (1) and (2) are similar to those of parts (1) and
(2) in Proposition 6. Figure 3 provides the intuition behind part 3.


Figure 3 Welfare-indifference curves for different levels of (z, t) with definite
costs (darker shades = lower level of welfare).

As in Figure 2, t is plotted along the abscissa and z is plotted along the
ordinate for m = 100. Each graph represents the locus of welfare-indifference
curves. It can be seen that for ‘small’ values of t, there exists an interior
local maximum for z, whereas the local maximum for z is equal to 0 for larger
values of t. The global optimum is at a point {0, t}, t > 0. As before, it has
been impossible to derive a closed proof of this result, but it turns out to
be robust for all values m = 1, 2, ..., 200 for which a simulation has been
run.⁴

Part 3 of this proposition highlights that there exists no clear evidence in
favor of or against a contest mechanism or a patent mechanism. The
significance of the idiosyncratic transaction costs differs depending on the
allocation problem at hand.
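
The contrast between the contingent and the non-contingent prize can be explored with a coarse grid search. The sketch below is not the author's simulation; it encodes one reading of the welfare expressions (42) and (43) for the parameterization a = 100, b = 1, T = 1, uses m = 100 as in the figures, and evaluates total welfare on an arbitrary grid of prizes and patent durations. The function names and grid bounds are assumptions.

```python
def total_welfare(z: float, t: float, m: int, lam: float) -> float:
    """Sum of innovator utility and downstream welfare.

    lam = 0: prize is paid only if the researcher succeeds (contingent).
    lam = 1: prize is paid with certainty (non-contingent).
    """
    q = (250000.0 * t + z) ** 0.25
    if q <= 1.0:
        # No effort is exerted; only a non-contingent prize is still paid.
        return -lam * z
    rho = (q - 1.0) / q               # success probability at optimal effort
    innovator_utility = (q - 1.0) ** 2
    downstream = (5000.0 * m * (3 * m - 2) * (1.0 - t) / (1 + m) ** 2
                  - 246250.0 * t)
    expected_prize_cost = (rho + lam * (1.0 - rho)) * z
    return innovator_utility + rho * downstream - expected_prize_cost


def best_on_grid(m: int, lam: float, z_grid: list[float],
                 t_grid: list[float]) -> tuple[float, float]:
    """Return the (z, t) pair with the highest welfare on the grid."""
    candidates = ((z, t) for z in z_grid for t in t_grid)
    return max(candidates, key=lambda zt: total_welfare(zt[0], zt[1], m, lam))


if __name__ == "__main__":
    z_grid = [250.0 * k for k in range(21)]   # prizes 0, 250, ..., 5000
    t_grid = [0.001 * k for k in range(21)]   # durations 0, 0.001, ..., 0.02
    print("contingent prize:    ", best_on_grid(100, 0.0, z_grid, t_grid))
    print("non-contingent prize:", best_on_grid(100, 1.0, z_grid, t_grid))
```

On this grid the contingent case selects a pure prize (t = 0) and the non-contingent case a pure patent (z = 0), which mirrors part (3) of Propositions 6 and 7; a finer grid or other values of m would be needed to probe robustness.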


3 Conclusions 


In this paper we have focused attention on the tradeoff between two types of 
mechanisms that can be used to induce incentives for scientific research, the 
patent and the contest mechanism. The relative transaction-costs of both types 
of mechanisms are a result of (a) the incentives of a patent holder to sell 
licences and thereby influence incentives on a downstream market and (b) the 
additional innovator-specific risk generated by a contest. It has been shown 
that the optimal licensing policy of an innovator tends to be suboptimal if the 
cost-reduction of the innovation is relatively large. In this case, reducing 
access to the innovation is profit maximizing for the innovator. If the cost 
reduction of the innovation is, however, relatively small, the innovator has an 
incentive to sell the optimal number of licences. In the latter case, an increase 
in the term of the patent merely shifts rents from the firms on the downstream 
market to the patent holders. 

A comparison of the patent and the contest mechanism as means to induce 
optimal incentives to invest in research must identify the transaction costs of 
both types of mechanisms. In the case of risk neutrality of the innovators it 
follows that optimal research incentives can be induced by an adequately 
designed contest as long as the necessary prize does not exceed the budget. It 
is, however, impossible to induce optimal incentives by the use of the patent 
mechanism because the patent holder can only participate in the surplus the 
licence generates for a firm. This surplus marginally (which is relevant for 
incentive design) differs from the social surplus generated by the innovation. 

The optimal incentive structure is very complicated when risk aversion of
the innovators is taken into consideration, and no clear-cut results can be
derived. This result is not very surprising, given the anything-goes results from
the literature on contest behavior with risk-averse bidders (see for example
Cornes and Hartley 2003, Konrad and Schlesinger 1997, Skaperdas and Li
1995). However, we have shown for a specific example that in a second-best
world with inefficient licensing incentives, extreme solutions can be
second-best optimal. Depending on the specific structure of the model it can
turn out that the patent mechanism dominates the contest mechanism and vice
versa. The results of this section therefore raise more questions than they
answer.


4 Details of the simulation will be provided by the author upon request.


Scientific Competition:
Beauty Contests or Tournaments? 


Comment by 


ROLAND KIRSTEIN 


1 Basic idea of the paper 


Pure public goods are characterized by non-rivalness in consumption (the 
marginal costs of serving additional consumers are zero) and by non-exclusion 
(consumers who do not pay for the good may also have access). While non- 
rivalness is a natural and unavoidable property of information goods, this is 
not necessarily true with regard to the exclusion principle. In his paper, Martin 
Kolmar points out that it is a matter of choice whether the exclusion principle 
applies to information goods. This is the case if society establishes patent right 
protection. Thereby, society decides whether information goods are pure 
public goods or club goods. This decision should take into account the 
respective welfare outcome in a positive transaction cost world. From a 
property rights point of view the question could be restated: Should the 
property rights to a specific information good be in the hands of a single 
decision maker (this would establish a club good), or should it be in the hands 
of all members of society (this would turn it into a pure public good)? 
Society faces two stylized types of mechanism to provide incentives for 
potential innovators: the contest and the patent. A contest of the Tullock type 
induces the players to compete for a prize which is awarded according to their 
relative effort. If innovations are generated, they can be used for free (no 
exclusion). In a patent system, on the other hand, a patent right and the
license fees it generates serve as the prize for the successful
innovator. To university researchers, contests are very familiar; they
seem to be the archetype of scientific competition. Researchers are less interested
in filing for patents, as their main interest is directed towards prestigious
publications, well paid chairs, and the access to research funds. If, however, an 
innovation has been published in a scientific journal, it cannot be patented 
anymore (according to German patent law), as it is already publicly available. 
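
A minimal sketch of the Tullock-type contest described above may help fix ideas. It assumes the simplest lottery form (each player's winning probability equals his or her share of total effort), risk-neutral players and a cost of one per unit of effort. These assumptions, the function names and the numbers are illustrative and are not taken from Kolmar's model.

```python
def win_probabilities(efforts):
    """Tullock lottery: each player wins with probability effort / total effort."""
    total = sum(efforts)
    if total == 0:
        # Assumption: with zero total effort the prize is allocated at random.
        return [1.0 / len(efforts)] * len(efforts)
    return [e / total for e in efforts]


def expected_payoff(i, efforts, prize):
    """Risk-neutral payoff: expected prize minus own effort (unit cost assumed)."""
    return prize * win_probabilities(efforts)[i] - efforts[i]


def symmetric_equilibrium_effort(n, prize):
    """Closed-form symmetric Nash effort for this textbook case: V (n - 1) / n^2."""
    return prize * (n - 1) / n ** 2


def best_response(i, efforts, prize, steps=10_000):
    """Grid search for player i's payoff-maximising effort, others held fixed."""
    candidates = [prize * k / steps for k in range(steps + 1)]

    def payoff(e):
        trial = list(efforts)
        trial[i] = e
        return expected_payoff(i, trial, prize)

    return max(candidates, key=payoff)


n, prize = 4, 100.0
e_star = symmetric_equilibrium_effort(n, prize)
print(e_star)                                  # effort per researcher
print(best_response(0, [e_star] * n, prize))   # should be close to e_star
print(n * e_star / prize)                      # share of the prize spent on effort
```

The grid search only checks numerically that no single researcher gains by deviating from the symmetric effort level; it is not a general equilibrium solver.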


2 Political relevance


Kolmar’s paper contributes to an ongoing debate in Germany on the abolition
of the “professorial privilege” in German patent law. A few years
ago, the federal government initiated a reform of the Employees’ Inventions
Act to abolish the “professors’ privilege”. Until then, the choice between
contest and patent, as described by Kolmar, was left to the professors them- 
selves. Having made an invention, a professor could choose whether to file for 
a patent (and collect 100% of the royalties) or to publish his results imme- 
diately, thereby forgoing the patent. The federal government, however, 
claimed that the number of patents filed by German universities was “too 
small”, and therefore deprived the university researchers of their privilege. 
From now on, an invention belongs to the researcher’s university (unless the 
employer rejects it). The university is supposed to set up professional patent
management and may file for a patent; the researcher is entitled to only 30%
of the royalties.¹

Kolmar’s model provides an explanation why it may have been beneficial 
not only for the professors, but also for society not to employ the patent 
system. The researchers were compensated with a scientific career, while 
society gained free access to their inventions (only the universities were left 
empty-handed). The federal government seems to have completely overlooked
this. On the contrary, it considered the existence of a functioning patent
rights system a prerequisite for inventions to have economic value. Moreover, 
the former German government seems to have been unaware of the fact that 
prices and value are different economic concepts. A patent right system may 
be a prerequisite for an invention to have a positive market price. Without 
exclusion, the market price will be zero, but the invention can still bear value. 
Even worse, a positive price for a non-rival good would create an ex post 
inefficiency. 


3 Discussion of Kolmar’s results 


If patent right protection is chosen, the exclusion principle applies and a
patent holder may charge other users of the right a license fee. Kolmar
shows that a patent holder has an incentive to restrict the oligopolistic com- 
petition by selling a smaller than efficient number of licenses if three con- 
ditions are met: 

— the patent holder competes in a Cournot oligopoly, 

— the innovation in question is one that decreases marginal costs, 

— and this cost differential is relatively large. 


The resulting welfare loss, however, has to be compared with the one gen- 
erated by the best alternative incentive mechanism. When comparing contests 


1 The economic analysis demonstrates that the new law may miss its goal of increasing
the number of filed patents; see Kirstein and Will (2006) and Will and Kirstein (2004)
for an overview of the relevant literature.

and patents, society should not condemn one mechanism on the grounds that
it fails to implement a first-best outcome, but rather look for a second-best 
solution. For risk-neutral inventors, Kolmar derives that the contest mecha- 
nism dominates the patent system. With risk-averse inventors, the results are 
not as clear cut. 

Kolmar’s paper highlights incentive and welfare aspects of the scientific
production process which prove relevant for the recent political debate in 
Germany. However, some objections deserve to be discussed. The first 
objection relates to the idea that only “contest” is a tournament, but not 
“patent”. Kolmar models unrelated inventions. In reality, however, the patent 
system may also be characterized as a rank-order tournament if several 
researchers explore products or technologies which are substitutes. The 
extreme case would be a patent race between researchers pursuing the same 
goal. In such a situation the prize is only awarded to the first inventor. Several 
patent race models exist,² but the Tullock formula may also serve this purpose.
The common wisdom is that patent races lead to overinvestment and, there- 
fore, are inefficient. 

The second objection may question whether the “contest” between 
researchers is actually a rank-order tournament. It is a typical property of a 
tournament that only one prize is used to motivate a group of agents. How- 
ever, researchers face the prospect of more than one chair or research fund. It 
may well be the case that several competing candidates all receive a prize. 
Moreover, it can pay to differentiate, i.e., to deviate from strict competition
and specialize in an idiosyncratic direction. This may put a researcher into a
better position, compared to his competitors, when pursuing a specific prize. 
In other words, equilibria can be asymmetric. If, however, “patent” may also 
be a tournament and “contest” is perhaps a much more complicated tour- 
nament with multiple prizes and asymmetric specialization, then the com- 
parison of the two mechanisms becomes more complex. 

A last objection may challenge the author’s view that the Tullock model of 
rent-seeking actually provides a realistic and correct description of the com- 
petition for research funds or tenured positions. What happens in reality is 
that competitors present their research agendas, and evaluation committees 
choose the most impressive or promising one. Previous effort may play a role
in convincing search committees or referees who decide upon research funds.
But the main criterion for their decision is not past achievements, but the
prospects of the candidate or his ideas.

These three objections suggest that the difference between
“contest” and “patent” is not as clear-cut as Kolmar’s model
describes it. So what are the differences between the two systems? One difference
lies in the respective system’s ability to discover decision errors. 


2 One example is found in chapter 14 of the textbook by Rasmusen (2001).

Scientific competition is characterized to a large extent by ex-ante evaluation,
i.e., it consists of multiple “beauty contests”. Candidates may present their
plans to several committees or single referees, who decide about awarding one 
of the available prizes, e.g., a chair or a research grant. For the successful 
candidate, being awarded a prize implies access to resources which can be
used for either the production of scientific output (research grants, chairs), or 
for its presentation (publication space, conference slots). In both cases, it may 
turn out only later whether the winner has actually produced something 
valuable, either because the production of knowledge only takes place later, 
or because the published work will be cited. 

Ex-ante, the decision-makers are uncertain about the respective merits. 
Normally, such an evaluation system is characterized by alpha- and beta- 
errors: the selected candidate may perform worse than expected, and a 
rejected candidate might have produced a more valuable invention. In 
extreme cases, when the loser receives no opportunity to pursue his research 
agenda at all, society will never learn whether the losing candidates would 
have produced something more valuable. In such a system, an alpha error (the 
selected candidate has failed to produce anything valuable) can be detected 
later on, while it would be impossible to discover a beta error. 

In the patent system, an ex post evaluation of achieved innovations takes 
place. The task of the evaluation authority is limited to determining which one 
of the contestants is the actual winner (the first, the best, the most original), 
and is thereby granted a monopoly right to use the innovation/invention. This
decision can even be made subject to judicial review. It remains possible that
unsuccessful projects, which are not awarded the patent right, may later turn 
out to be actually superior. Both alpha- and beta-errors can be discovered 
later, and therefore the quality of the system can be controlled better. 
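
The difference in error detection can be made concrete with a small sketch. The candidates, their values and the threshold below are hypothetical; the only point is which errors an observer could ever find out about under each regime.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    expected_value: float   # evaluator's estimate at award time (hypothetical)
    realized_value: float   # what the project actually yields if carried out


def detectable_errors(candidates, threshold, losers_carry_out_projects):
    """Return (alpha_detected, beta_detected) for one selection round.

    The winner is the candidate with the highest estimate.
    alpha error: the winner's realized value falls short of `threshold`.
    beta error: some rejected candidate realizes more than the winner.
    A beta error is only observable if the losers' projects are carried out.
    """
    winner = max(candidates, key=lambda c: c.expected_value)
    losers = [c for c in candidates if c is not winner]
    alpha_detected = winner.realized_value < threshold
    beta_exists = any(c.realized_value > winner.realized_value for c in losers)
    beta_detected = beta_exists and losers_carry_out_projects
    return alpha_detected, beta_detected


pool = [
    Candidate("A", expected_value=9.0, realized_value=2.0),
    Candidate("B", expected_value=7.0, realized_value=8.0),
    Candidate("C", expected_value=5.0, realized_value=1.0),
]

# Ex-ante "beauty contest" in which the losers receive no resources at all.
print(detectable_errors(pool, threshold=5.0, losers_carry_out_projects=False))
# → (True, False)

# Ex-post evaluation: all projects exist before the award is made.
print(detectable_errors(pool, threshold=5.0, losers_carry_out_projects=True))
# → (True, True)
```

In both runs the same beta error exists (candidate B would have done better than A); it simply remains invisible when the losers never get to carry out their projects.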
The Role of Patents in Scientific Competition:
A Closer Look at the Phenomenon of Royalty Stacking 


by 
CHRISTINE GODT 


1 Introduction 


Recently, patents have become both a product of scientific research and a
measure of performance and excellence. Prior to this, patents were confined to
industrial development close to the market, aimed at keeping the idea
secret inside the corporation for as long as possible until the commercialisation of
the end product began. In contrast, basic science was perceived as a separate
counterpart to applied science and defended as a patent-free zone. Scientific 
performance in basic science was conceived as reputation measured by pub- 
lications. Today, in the field of natural sciences, patents have supplemented 
publications and citations as an indicator of reputation not only of individual 
researchers but also of scientific institutions. This development is highly 
contested with respect to its impact on basic science. Do patents impede or
promote science, and in which ways? Will they accelerate research or slow it 
down? What kind of incentives do they provide for researchers and their 
home institutions? When patents found their way into the scientific realm in 
the 1980s, opponents raised concerns that researchers would hold back their 
results, publish less or later and refuse the exchange of knowledge and 
material. In the 1990s, concerns were raised that patents would proliferate, 
thus stifling research and development.¹ Proponents would claim that patents
foster scientific competition,² that they set an incentive for individuals to
invent and for institutions to invest, thus resulting in more innovation. 

In the meantime, the debate has become more sophisticated. There is evidence
that scientists in private and in public research both patent and
publish (Stokes 1997, Agrawal and Henderson 2002, Murray and Stern 2005).
The long-perceived tension between patenting and publishing does not seem
to exist, at least not in a sharp and measurable form. Empirical evidence suggests that
access is more willingly granted to patented knowledge than to material 
(Walsh, Cho and Cohen 2005). Access problems persist in research on clinical 


1 This discussion is known as the “anticommons debate”, an inverted reference to the
famous article “Tragedy of the Commons” by Hardin (1968). The parallel was first drawn
by Heller (1998). The debate about how to evaluate the process is still ongoing: Is patent
protection “too strong” (inter alia Eisenberg 1996a, David 2004) or “too weak” (Heller
1999)?

2 For the US see, e.g., Nelson (1998), Walsh, Arora and Cohen (2003); for Germany,
e.g., Hoeren (2005).

diagnostics, suggesting that problems occur when research is closely related to
(or is itself) a commercial activity.³ Overlapping claims, e.g. related to
DNA, make it difficult to know one’s own rights and those of others (Verbeure,
Mattijs and Overwalle 2005). Special attention is paid to the problem of
patented research tools.⁴ Consensus is growing that patents in science do not
function in their traditional sense as incentives for the individual researcher to
invent. Researchers respond more strongly to other incentives (Agrawal and
Henderson 2002). Former high-income expectations of research institutions 
through patenting and licensing have not been fulfilled, at least not for the 
average university. Instead, it has become evident that patents play different 
roles for different actors. In industry, beyond the traditional function of 
competitive exclusion, patent protection for scientific research results serves 
two different functions. First, patents commodify information and thus secure 
the transfer of information between internationally decentralised entities. 
Second, as patents can be purchased, formerly intramural research can be
outsourced and re-acquired in a contract-based transaction. In other words,
patents are essential for the transfer of knowledge between contractors and
the firm. For research-intensive, small biotech companies, patents serve to
attract venture capital. For universities, other functions prevail: Patents provide
benchmarks for ingenuity and high performance, thus enhancing publicity
and profile. Increased international cooperation in every form, between
scientists and industry⁵ and between scientists across borders,⁶ has instigated
the claiming of intellectual property rights.⁷ Patents can help to establish start-up
companies, thus providing career opportunities for graduates.⁸ For policy
makers in industrialised countries, two functions are important: First, a high 


3 Merz, Kriss and Leonard et al. (2002), Walsh, Cho and Cohen (2005); in such cases, patent
holders are more likely to assert their rights and researchers are more likely to abandon infringing
activities.

4 The public discussion about research tools (see for the US: National Research
Council 2005, Gewin 2005; for the UK: Nuffield Council on Bioethics 2002) has given
rise to much research (legal, economic and econometric), see Eisenberg (2000), Holman
and Munzer (2000) on the one hand highlighting problems, and Walsh, Arora and Cohen
(2003) on the other hand aiming at appeasing and structuring the debate.

5 See the rationale of the 6th EU Framework Research Programme (recital 1 of the
Decision No. 1513/2002/EG from 27 July 2002, Off. J. L 232/1) and the rationale of the
funding policies of the German Research Ministry in: Richtlinien für Zuwendungsanträge
(BMBF-Formular 0027/01.03, available at http://www.bmbf.de).

6 See the contributions in Edler, Kuhlmann and Behrens (2003), see also the 
descriptions of Knorr-Cetina (1999). 

7 In the case of science-industry collaboration, it is the industrial partner who usually 
has an interest in proprietarily secured knowledge; empirical evidence for the correlation 
between industry involvement and patent applications of research institutions is pro- 
vided by Carayol (July 2005, 5 and 13). In the case of science-science collaboration, it is 
the scientists themselves who are interested in securing their rights to material and 
knowledge in order to protect their own future research opportunities. 

8 Or can provide additional pension payments, as suggested by Carayol (2005, 14).

patent standard serves as an instrument in global regulatory competition to
attract industry, because innovative, high technology firms tend to prefer 
countries with a high patent standard. Second, patents are meant to enhance 
the transfer of knowledge from science to industry, thus securing long-term 
innovation and growth. Therefore, public policy has fostered the collaboration 
of science and industry, most prominently by funding schemes, and supported 
the move of patent protection into basic science.⁹

The following article focuses on the patent function of technology transfer 
and will only cover the technology transfer from basic science to industry. At 
its center is the question whether there is a causal link between patents in 
basic research and technology transfer to industry, as is often claimed. Thus, it
will neither analyse the much debated impact of patents on scientific research
behaviour per se,¹⁰ nor will the incentive for the individual researcher be
discussed. The article is less interested in the behavioural incentive of patents 
to invent than in the institutional effect of patents on technology transfer. 
Thus, it complements the broad debate about the effects of patents in science 
by providing an additional perspective. It takes patents on scientific results of 
public research institutions as a given fact, but asks about the commercial 
logic underlying the assumption of the causal link. It contributes to a better 
understanding of the functions and different roles fulfilled by research institutions.
The modern university system, especially in Europe, is characterised
by a mixture of competition and cooperation to which conventional economic
approaches are not easily applied.¹¹ The article raises the question whether a
patent is a decisive sine qua non or just one enhancing factor among
many others that instigate technology transfer. Are patents important in some
sectors, less important in others? Are they beneficial in some, but detrimental 
in others? 

The article focuses on the counterintuitive phenomenon of “royalty stack- 
ing”. This expression describes the problem of accumulating royalty promises 
in the research process which results in an ever decreasing profit margin until 
the research result is “ready” to be transferred to the process of product 


9 Funding rules require researchers to secure intellectual property rights in their
research results. Technology transfer offices are fostered, in Germany as an integral part
of the patent reform that abolished the so-called professor’s privilege in 2002. This
provision had assigned professors’ inventions to them personally. Now, all inventions can be
claimed by the university or research institution.

10 A lot of research has been done in respect of how scientific research has changed 
under the influence of the hybrid incentive structure of traditional norms and com- 
mercial incentives, see only Godt (2007, Chap. 3), v. Overwalle (2006), v. d. Belt (2004), 
Rai and Eisenberg (2003), Heller and Eisenberg (1998), Blumenthal et al. (1997). Until 
today, the legal discussion has revolved around the question how science can be shielded 
and whether the given instruments are sufficient, especially the so-called research 
exemption in patent law; see Galama (2000), Holzapfel (2003), Godt (2007, Chap. 6).

11 Mowery and Sampat (2005, 233) describe this analytical lacuna.

development. Therefore, the phenomenon threatens the very idea of technology
transfer from science to industry. It is counterintuitive because it
contradicts the very assumption that property rights result in the most
efficient distribution of resources. Therefore, the analysis of the phenomenon
of “royalty stacking” may help to understand the conditions required for 
technology transfer to happen, but may also improve our understanding of the 
boundaries beyond which the dynamics of the patent system are more detri- 
mental than beneficial to basic science — and in the long run to industrial 
prosperity and to society as a whole. 

The article proceeds as follows. First it describes the phenomenon and its 
generation (2). It then puts the phenomenon into the broader context of 
technology transfer in the information society (3). Taking these considerations 
into account, it portrays some possible policies for the various actors involved 
(4) before drawing some final conclusions (5). 


2 “Stacking Royalties” 


The expression “Stacking Royalties” describes the problem of royalties
negotiated by researchers accumulating over the subsequent research
process. If the profit margins for the commercial developer have already been
used up before the developer comes into play, technology transfer from science
to industry will not happen. The patent attorney Philip Grubb estimated
that a royalty accumulation of 20% is the limit for transferring the research
result to the industrial process of product development.¹²
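
The arithmetic behind this threshold is simple and can be sketched as follows. The sketch assumes that every royalty promise is a percentage of the same base (for instance the end product's sales), so that promises simply add up, and that a stack of exactly 20% is still acceptable. The stage names and rates are hypothetical; only the 20% limit comes from the estimate cited above.

```python
ROYALTY_LIMIT_PERCENT = 20  # Grubb's estimate as cited in the text


def stacked_royalty(promises):
    """Sum the royalty rates (in percent) promised along the research process."""
    return sum(rate for _, rate in promises)


def transfer_feasible(promises, limit=ROYALTY_LIMIT_PERCENT):
    """Assumption: transfer to product development fails once the stack exceeds the limit."""
    return stacked_royalty(promises) <= limit


# Hypothetical promises made before any commercial developer is involved.
promises = [
    ("gene sequence licence", 6),
    ("research tool A", 5),
    ("research tool B", 4),
    ("university inventors", 3),
]
print(stacked_royalty(promises), transfer_feasible(promises))   # 18 True

# One more upstream promise pushes the stack past the limit.
promises.append(("cell line supplier", 4))
print(stacked_royalty(promises), transfer_feasible(promises))   # 22 False
```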

There are two causes for the accumulation of royalty claims, one being 
proprietary, the other being contractual. The proprietary cause is at the heart 
of the patent system. Problems with this type of accumulation are inbuilt and have,
until today, been dealt with either by statute or in corporatist ways. However,
problems occur in the modern science system because these practical 
mechanisms are not available to research institutions and because the ever 
broadening scope of patent protection affects science in particular. The con- 
tractual cause is the one that gives rise to yet unresolved challenges for sci- 
ence. Both are mutually reinforcing. 


12 Oral presentation during the workshop on “Genetic Inventions, Intellectual Prop- 
erty and Licensing Practises”, organised by the German Federal Government (BMBF) 
and the OECD, 24/25 January 2002 in Berlin.


2.1 Property


For the sake of analytic precision, “proprietary royalty stacking”, first, needs 
to be distinguished from “stacking patents”. The latter, technically called 
dependency, is the central patent mechanism. 


2.1.1 Linear dependency distinguished

Dependency describes the “stacking of patents” (not royalties). It is the key to
the patent system as it upholds the incentive to invent during the process of 
continuous progress. It makes the patent the strongest form of intellectual 
property in comparison with copyright or plant varieties. First of all, the 
patent provides an incentive to any innovator by granting him/her a time-limited
monopoly.¹³ However, any further improvement, in principle, has the
potential to destroy the economic value of the former innovation before the 
patent expires. This is what Schumpeter (1942) called “the process of creative 
destruction”. Therefore, in order to uphold the incentive to innovate in the 
pursuit of progress, the system links initial patents to subsequent patents of 
follow-on innovators. The idea is that although the subsequent invention is 
“novel”, “non-obvious” and “inventive” and thus patentable on its own, this 
patent is still covered by the scope of the basic patent.¹⁴ The legal consequence
is that neither the base patent holder nor the improver is allowed
to use the invention of the other unless authorised by a negotiated license.
This mechanism creates mutual blocking rights and enables the pioneer 
inventor to reap some of the benefits of subsequent improvements. 
Dependency provides the balance between the incentive for the pioneer and 
the incentive for improvers.! In principle, dependency does not result in 
royalty stacking. If one patent builds on a previous one (linear dependency), 
any follower can promise a share of his/her own profits when using a former 
invention. Previous royalty promises can only be for shares of this promise;
thus they do not accumulate over time. 
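
Why linear dependency does not stack can be seen in a short sketch. It assumes a chain in which the commercial user pays a single royalty rate to the most recent patent holder, and each holder passes a fixed fraction of what he or she receives on to the predecessor. The rates and the chain length are hypothetical.

```python
from fractions import Fraction


def linear_chain_distribution(end_rate, pass_on_shares):
    """Split a single royalty along a linear chain of dependent patents.

    end_rate: share of profits the end user pays to the latest patent holder.
    pass_on_shares: for each holder, from the latest to the second-earliest,
        the fraction of incoming royalties promised to the predecessor.
    Returns the share of end-user profits each holder keeps, latest first.
    """
    kept = []
    incoming = Fraction(end_rate)
    for share in pass_on_shares:
        share = Fraction(share)
        kept.append(incoming * (1 - share))
        incoming = incoming * share
    kept.append(incoming)  # the pioneer has no predecessor to pay
    return kept


kept = linear_chain_distribution(Fraction(5, 100), [Fraction(1, 2), Fraction(1, 4)])
print([str(k) for k in kept])   # shares of end-user profits kept by each holder
print(sum(kept))                # total burden on the end user: 1/20
```

However long the chain becomes, the end user's burden stays at the single rate agreed with the latest holder; earlier promises only redistribute that amount.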

For applied industrial research, linear dependency has not yet caused 
insurmountable problems (Kowalski and Smolizza 2000). Although history 


13 However, time limits differ considerably. Patents have a maximum lifespan of
twenty years after first application (although less than half are prolonged after 10 years 
by their owners). Copyrights usually last seventy years after the death of the creator. 

14 For the dogmatic distinction between “novelty” of the inventive idea and “breadth 
of a patent scope” which form the basis of dependency in patent law, see Godt (2003, 11), 
Godt (2007, Chap. 7). 

15 Merges (1994); for an economic description of the equilibrium between sufficiently 
strong incentives for the pioneer and the improvers, see Scotchmer (2004). 

16 Although, unsurprisingly, the definition of the “right balance” is highly contested.
On the quest for a broad patent scope for the pioneer see, e.g., Kitch (1977), on the quest 
for sufficiently large incentives for the innovators see, e.g., Nelson (2000), Merges (1996), 
Scotchmer (1991).

has witnessed situations of blockage in the optics and the aviation industry
(Merges 1994, 1996), choosing between the exclusion of competitors and 
granting a license is a business decision guided by strategic considerations.¹⁷
The heightened concern about rising transaction costs in patent litigation
(Fischermann 2005, Kanellos 2005) led economists and lawyers to advise the
tightening of patentability requirements (e.g., Merges and Nelson 1990, Barton
2001, 881) by the internal reorganisation of patent offices (Moufang 2003,
Straus 2001b, Barton 2000) or by third party review.¹⁸ Besides, ignoring
infringements is as widely known¹⁹ as (non-infringing) parallel developments
(Scotchmer 2004, 140ff.). Under the threat of compulsory licenses and antitrust
motions, industry has usually been willing to find arrangements, preferably
via cross-licensing. As a consequence, dependency has until recently
attracted little academic attention beyond the field of self-reproductive
material.²⁰

Problems occur, however, when a patent depends on too many previous 
independent patents (“property rights complex”) (2.1.2) and when too many 
further developments depend on one basic patent (2.1.3). 


2.1.2 Dependency on too many patents: The “property rights complex” 

The problem of dependency of one patent on too many parallel patents and 
the resulting royalty stacking is not a new one for industry and is dealt with 
under the heading of “property rights complex”. The profitable development 
of an end product is put at risk when too many employees of different firm 
sections claim a share of the profits from a new (typically assembled) product. 
In Europe, this problem is explicitly dealt with in remuneration rules for 
employee inventions in private firms and in public service.²¹ As an annex to
the law governing employee inventions (German: Arbeitnehmererfindergesetz,


17 Although the strategic use of patents puts some pressure on the system, see Barton
(2000, 2002), European Commission (2003).

18 Either envisioned as an administrative (Jaffe and Lerner 2004, 22) or a judicial
procedure (Lemley 2001).

19 Schmidtchen (1994, 37) notes two examples: the un-licensed production of light
bulbs by Philips and the un-licensed production of plant-oil based butter (margarine) by
Jurgens and van den Bergh (later Unilever), both resulting in a market-dominating
production.

20 The classic example is the sui generis system of plant varieties; for a concise historic
account with an outlook on modern biotechnology see Winter (1992) and Straus (1987).

21 In Germany: “Richtlinien für die Vergütung von Arbeitnehmererfindungen im
privaten Dienst” (RLArbnErfprivD) 20 July 1959 (Bundesanzeiger Nr. 156 v. 18. Aug.
1959), version 1 Sept. 1983 (Bundesanzeiger 1983, 9994). Pertaining to inventions of
employees in public service according to “Richtlinien für die Vergütung von
Arbeitnehmererfindungen im öffentlichen Dienst” of 1 Dec. 1960 (Bundesanzeiger Nr. 237
from 8 Dec. 1960), enacted as Executive Order of the Minister of Labour after
consultation with representatives of employers and employees, based on § 11 ArbnErfG;
printed in Bartenbach and Volz (1999, 2002).

ArbNErfG), No. 19 of the German remuneration guidelines
holds that the value of the whole complex shall be evaluated if a process or a
product uses a number of prior inventions.²² This value (in practice usually 1
to 3% of expected profits) is to be shared by all previous inventors — taking 
each contribution to the whole into account. Disputes are settled by an 
arbitral body (“Schiedsstelle”) (§ 29 ArbnErfG). 
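
The sharing rule can be illustrated with a brief sketch. The contribution weights and the profit figure are hypothetical, and a proportional split is only one possible reading of taking each contribution to the whole into account; the passage above states no formula.

```python
from fractions import Fraction


def split_complex_value(expected_profit, complex_rate, contributions):
    """Value the whole complex once, then divide it among the inventors.

    complex_rate: value of the complex as a share of expected profits
        (the text mentions 1 to 3% as usual in practice).
    contributions: mapping inventor -> integer weight of the contribution
        (hypothetical; proportional sharing is an assumption).
    """
    complex_value = Fraction(expected_profit) * Fraction(complex_rate)
    total_weight = sum(contributions.values())
    return {
        inventor: complex_value * Fraction(weight, total_weight)
        for inventor, weight in contributions.items()
    }


shares = split_complex_value(
    expected_profit=1_000_000,
    complex_rate=Fraction(2, 100),
    contributions={"inventor_1": 3, "inventor_2": 2, "inventor_3": 1},
)
for inventor, amount in shares.items():
    print(inventor, float(amount))
print(float(sum(shares.values())))   # equals the value of the whole complex
```

Because the complex is valued first and divided afterwards, adding further inventors changes the individual shares but not the total claim on the product's profits.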

This rule builds on the concept that each employee is entitled to his/her
invention although he/she is paid for making inventions. Technically, only the
employer has the right to claim the invention. If the invention is claimed,
compensation is due to the employee. This system, installed in Germany in the
1930s, has come under pressure due to the bureaucratic burden for the
employer and the risk of missing the four-month deadline (§ 6 sec. 2 ArbnErfG).
A national draft reform proposal aims at making the system easier. It proposes
the removal of the deadline and of the employer’s instrument of claiming the
employee’s invention (“Inanspruchnahme”). The remuneration system is also
to be simplified. Instead of a share in profits, the employee shall only be entitled
to lump sums, with additional royalty promises remaining optional.²³

In the scientific environment, things differ in three respects. First, as one
single innovative development is usually not confined to one institution, the 
corporatist mechanism of evaluating “the whole” is not available to a research 
institution. Typically, dominant patents are owned by a plurality of research 
institutions. Second, the problem is exacerbated especially in molecular 
biology by the necessity of using a large array of research tools. Third, 
according to German law, university scientists are entitled to 30% royalties 
(§ 42 No. 4 German ArbNErfG). 


2.1.3 Too many dependent patents: The inverse “property rights complex”

Problems also occur when too many patents depend on one base patent. This
is the problem that has prompted the lively debate about anticommons.²⁴ Base


22 “Schutzrechtskomplexe” Nr. 19 RLArbnErfprivD: “Werden bei einem Verfahren
oder Erzeugnis mehrere Erfindungen benutzt, so soll, wenn es sich hierbei um einen 
einheitlich zu wertenden Gesamtkomplex handelt, zunächst der Wert des Gesamtkom- 
plexes, gegebenenfalls einschließlich nicht benutzter Sperrschutzrechte, bestimmt wer- 
den. Der so bestimmte Gesamterfindungswert ist auf die einzelnen Erfindungen auf- 
zuteilen. Dabei ist zu berücksichtigen, welchen Einfluss die einzelnen Erfindungen auf 
die Gesamtgestaltung des mit dem Schutzrechtskomplex belasteten Gegenstandes 
haben.”

23 For a critical economic analysis see Will and Kirstein (2004) and Kirstein and Will
(2004), arguing that the profit share is less efficient than a bonus contingent on the 
project value. 

24 The anticommons debate as a discussion about “the right patent scope” has displaced
the questions, formerly more popular with economists, about the optimal time
length of patents (Merges and Nelson 1990, Scotchmer 1999) and the differentiation of
patent protection between industries (Lemley 1997).

patents which are too broad might block research and competing developments;
subsequent (dependent) patents might be too narrow to be economically
useful and therefore poison the system by increasing transaction costs and
making research more expensive. However, at first glance, the growing number
of dependent patents does not instigate the stacking of royalties — the focus of 
this article. On the contrary, the smaller the scope of patents becomes, the 
smaller is the chance that other patents will depend on them. 

A closer look reveals something else: the broadening of patent scope not only
increases the number of improvements covered by a prior patent; the growing
scope also creates the often deplored “patent thicket” (Shapiro 2001) of
overlapping claims. This problem is most virulent in molecular science, when a
nucleotide sequence or a gene sequence is covered by more than one patent
(Jensen and Murray 2005, 240), but it also troubles the information industry
(David 2000). It was originally dealt with by the outright exclusion of
discoveries and theories. With the move of the patent system to cover research
results and information, especially in the fields of biotechnology and
information technology, this “easy solution” has been blocked.25 Problems
formerly crowded out by the discovery/invention distinction now seriously
threaten the functioning of the patent system.26 They also instigate
dependencies which result in the accumulation of royalties.27

The discussion about the right definition of patentable subject matter
(technically, the distinction between invention and discovery) is, in
principle, an old debate about the proper balance between a sufficiently strong
incentive for the inventor and a sufficiently broad leeway for improvers. The
concepts were transposed to modern science by the economist Suzanne Scotchmer
(1991) in her seminal paper.28 She holds that “sequential innovation” is a
specific characteristic of the modern science system. She redefines modern
scientific progress in ways that were formerly enshrined in considerations on
the exclusion of discoveries and theories from patentability. She thereby
inspired the modern debate about the right scope of patents and about the
problems which arise when patents are either too numerous and too narrow or
too broad, thus impeding subsequent developments.29

Yet, this discussion is dominated by a discourse about access rights to
research results for scientists. The perceived problem is the exercise of


25 Bearing in mind that the distinction between discovery (theories) and invention has
always been conceived as an “entry” qualification to the patent system rather than a
semantic definition. For the historical example of the chemical dye industry see
v. d. Belt (1992); for modern biotechnology, Straus (2001a), Godt (2007, Chap. 2).

26 For a considered analysis by scientists not known as radical critics of the patent
system see Cornish (2004); also the contributions in Dreyfuss, Zimmerman and First
(2001).

27 Also recently considered a serious problem by Jensen and Murray (2005, 240).

28 Scotchmer (1991), later fine-tuned in Green and Scotchmer (1995).

29 See the “anticommons debate” (Will and Kirstein 2004, Kirstein and Will 2004).

exclusion and the rising costs of research. Therefore, proposals aim at
shielding science from the exercise of patents via a broad research exemption
(Eisenberg 1987, Barton 2000, Gold, Joly and Caulfield 2005) or via
access-securing mechanisms of the compulsory license type.30 These solutions
would also ease the problem of stacking royalty promises that follow from
licensing. However, with research institutions becoming normal commercial
partners and scientific patenting becoming an everyday phenomenon, research
exemptions and compulsory schemes will continue to be narrow and rare.31
Therefore, the problem of royalty stacking will also remain unresolved.


2.2 Contracts 


2.2.1 The beast of the knowledge society 

The second mechanism for royalty accumulation is contract. Contractual
arrangements can be even more intricate than the property mechanism. The
latter only functions when a patent is technically dependent on a plurality of
prior patents. Thus, merely “using” a patented method in research without
making it part of the new patented invention will seldom result in a veto
right or in a claim to royalties. However, contract clauses might “reach
through” the use of the patent to future patents (or future contracts) by
stipulating that the owner of the patented research tool is entitled to
royalties from those patents that will result merely from using this research
tool.32 This can result in stacking royalties.

There are various reasons for the owner of intellectual property to
negotiate such clauses. Evidently, such clauses help to keep track of the
market. Tracking future dependent patents is difficult. More important is that
information goods are licensed instead of sold. In contrast to the industrial
era, ownership of a patented product, such as a high-tech microscope, is no
longer simply or necessarily transferred. In the information era, only the use
of the technology is consented to, i.e. licensed. The transfer of property is
not at the center of interest. What matters is the control of use. For
copyright, contractual clauses


30 Such as the newly discussed clearing-house mechanism for patented diagnostics; see
contributions to the conference “Patents and Public Health”, organised by Overwalle
under the umbrella of the CIPR, Leuven, Belgium, on May 27, 2005,
http://www.law.kuleuven.ac.be/cir/conference_27may.htm (visited 7/05).

31 The Supreme Court of the US let stand, by declining review, a decision of the CAFC
in Duke University v. John Madey which narrowly interpreted the experimental use
exemption as not covering academic non-commercial use per se; for a commentary see
Eisenberg (2003).

32 To be clear: These do not necessarily depend on the previous patent.

allow the restriction of duplication.33 In science, these contracts not only
include use restrictions which evidently impede scientific freedom,34 they also
promote the stacking of royalties.


2.2.2 Information contracts in science 

The public debate about “reach through contracts” as a problem for scientific
research was first launched by an expert advisory committee of the US
National Institutes of Health in 1998 (National Institutes of Health (NIH)
1998). It was embedded in the broad discussion about research tool patenting.
This committee was the first to frame it as a problem for scientists and labeled
it “royalty stacking”: When scientists do research, they depend on a variety of
research tools (material, methodologies, know-how) which need to be
licensed. However, in contrast to industry, additional drivers are at work in
science when such contracts are negotiated, and they foster the accumulation
of royalty promises:

When a license is negotiated, the typical remuneration is a royalty. In
principle, royalties are in the interest of both parties. The uncertain value of
the information good is captured by a percentage of profits earned later in the
development instead of a fixed price. Payment is postponed until the
commercial value materialises. The licensee does not have to procure money
immediately. The licensor hopes that the share in profits will be higher than an
immediate payment would be.

The effects of these basic principles are reinforced in the scientific
environment. For the licensor of a patented research tool, science is the only
market and the only source of income. Research tools do not usually give rise
to “dependency” of subsequent patents because they mostly enable research
but do not necessarily form part of the subsequent invention.36 Therefore, as
the chances of future proprietary profit participation are small, the immediate
selling price must be high, but this high price is difficult to realize. In
fact, at this early stage the value often seems to be low, a point in favor of
royalties. Also, the licensee will normally not be the one to develop the final
product ready to be commercialized. Therefore, it is in the interest of the
licensor to secure some profit from the value-enhancing chain by “reaching
through” the contract. The license permits the broadening of the group of
people obligated to the original licensor. The contract can not only obligate
the licensee to pay


33 This issue has been intensively discussed as a problem of private legislation
undercutting publicly secured access rights, see Reichman and Franklin (1999, 964),
Samuelson and Opsahl (1999).

34 This problem was analysed in Godt (2007, Chap. 6).

35 Type 2 of the three types of cumulativeness of Scotchmer (2004, 144); also coined as
“stacking licenses”, see Runge (2004, 821).

36 A major exception to this rule is gene patents. Both diagnostics and therapeutics
will typically be dependent on isolation patents.

a share of his/her profits made when he/she succeeds in improving, patenting
and licensing. It can also require him/her to transfer the royalty obligation in
favor of the old licensor to the next scientists taking up the research.37
Assuming that a final research result builds on a broad range of “in-licensed”
technologies (apart from previous dominant technologies), such promises
accumulate over time.

For scientists as licensees, the royalty promise is of no concern with regard
to the problem of the unknown market value of the information good. From
their perspective, future royalties will not be debited to their current research
budget, but will be borne by the research institution or future acquirers.
Therefore, they, too, have an incentive to negotiate royalties.38 In addition,
the royalty promise reduces the time investment in negotiations and provides
them with quick access to the research tool.39

Consequently, contractual promises contribute to royalty stacking.


2.3 Discussion 


Summing up, with patents being registered in science long before a product
becomes reality, two mutually reinforcing factors contribute to the risk of
royalty accumulation: a proprietary and a contractual mechanism. The
proprietary mechanism touches on the sensitive question of the science/market
distinction that was once captured by the invention/discovery distinction.
Academically new and challenging, however, is the contractual mechanism.
This reason for royalty accumulation deserves more attention. Up to now,
patent lawyers and economists have focused on the exclusionary function of
property rights, and on contracts only as far as they concern the right to
exclude. The tectonic shift from sale to lease in information goods has as yet
attracted little theoretical analysis.40

Under both mechanisms, research patents run the risk of accumulating
royalty promises before they are finally ready to be commercialised (“royalty
stacking”). Thus, the causal link between patents and technology transfer is
not as compelling as is often claimed. Patents are one, but not the only,
condition for technology transfer to happen. Industry will not be interested in
acquiring research patents if substantial profit shares have already been
assigned to others. Therefore, stacked royalties ultimately threaten the
transfer of (patented) knowledge from science to industry.


37 Type 3 of the three types of cumulativeness of Scotchmer (2004, 145).

38 Not taking into account institutional long-term interests (like the problem of
stacking royalties).

39 Patience is a decisive factor that influences the “efficient” price, see Güth, Kröger
and Normann (2004).

40 For a first account see Godt (2007).


162 Christine Godt 


3 Technology transfer in the context of the information society 


Before addressing policies for dealing with the stacking of royalties, a brief
historical note seems appropriate. The shift of paradigms in research policies
came about in the 1980s. In the late 1970s, policy makers had identified a
slowing down of innovation in Western economies whereas global
technological change was accelerating. Thus, they turned to intellectual
property as a classical incentive for innovation and strove for reform, both in
the US and in Europe. In the US, the initial idea was to strengthen small and
medium-sized companies. This was the approach of the celebrated Bayh-Dole Act
of 1980. The Act transferred the ownership of patents resulting from
government-sponsored research to the inventor. Prior to this, those inventions
had generally been assigned to the government. However, it came as a surprise
that it was the universities and research institutions which primarily profited
from the Act. By patenting, they attracted large amounts of investment, gave
spin-offs an economic base to start with, and thus not only nurtured the
emerging New Economy but provided it with the essential knowledge base.
Shortly after its first enactment, the Bayh-Dole Act was adapted to this
realization.41 Even if initial expectations of high revenue materialised for
only a few universities, the activities of the newly established technology
transfer offices strengthened the regional knowledge base of the economy and
the reputation of research institutions.

In Europe, the process developed differently. Although the reforms were driven
by the same concern, the legal set-up was fundamentally different. Legally,
patents were always assigned to the inventor. In universities, the so-called
“Professor’s Privilege” safeguarded the inventor’s ownership of the invention
as part of academic freedom.42 Public laws provided for equitable licences
granted to everybody when an invention was publicly funded. This mandatory
requirement came under pressure, first inside the EU member states,43 later in
EU research policies. Publicly funded research results were diagnosed as not
being turned into “useful products”, and the restrictions mentioned on the
exclusivity of property rights were identified as the reason (Ullrich 1997).
By now, public access rights have been either abolished or relegated to
administrative regulations. The owner only has the obligation to use the


41 A short history of the Bayh-Dole Act is provided by Eisenberg (1996b). 

42 Formerly Art. 42 German Employee Inventions Act
(Arbeitnehmererfindungsgesetz, ArbNErfG).

43 For Germany, see the advice of the expert group to the Ministry of Science and
Technology, Ullrich (1997).

44 6th EU Framework Programme, Art. 23 Reg. (EC) No. 2321/2002, Off. J. L 355/23.

45 E.g. No. 8.1 Internal Regulations of the German Ministry for Education and
Research (“Besondere Nebenbestimmungen für Zuwendungen auf Ausgabenbasis”,
funding for public research institutions, BNBest-BMBF June 2002): Free access has to
be provided for other academic research institutions.

results. Patent owners have almost unrestricted power over their intellectual
property rights and are even allowed to license them exclusively. Also, the
“Professor’s Privilege” has been abolished in major EU countries.47 Like any
other employer, the university can claim the intellectual property right with
due compensation to the personnel.48 This reform provided the technology
transfer offices with the proper base for professional management of the
universities’ patent portfolios. Thus, in contrast to the US and in contrast to
popular policy perception,49 the patent was not deployed in its classical way
as an initial incentive to invent. The fact that universities come up with
innovative ideas is taken for granted.50 The regulatory core idea was that
scientific research patents would instigate technology transfer from research
institutions to industry because the knowledge is proprietarily secured.
Thereby, the design of scientific research became geared less towards questions
valued by the epistemic scientific community and more towards industrial
interests. This redefinition of science policies became known as a paradigm
shift from science being a “push partner for industry” to industry becoming a
“pull partner for science”.51 In other words, it turns the old perspective of
science as “producer driven” vis-à-vis the consumers (the colleagues)52 towards
a closer science/industry relation. These motivations of industry and economic
policy makers coincided with expectations of policy makers and scientists alike
that research institutions could both attract additional private funding for
research prior to an invention and, after the invention is made, sell their
research results, thus contributing to their own funding. Although these
expectations have not materialised (not for most US universities, even less in
the EU), the positive effects on the knowledge base of the overall economy


46 For the EC: Art. 23 Reg. (EC) No. 2321/2002, Off. J. L 355/23; for Germany: Nr. 4.2
BNBest-BMBF June 2002 (ibid.); German Research Foundation (DFG): No. 13 and 14
“Verwendungsrichtlinie Sachbeihilfe; Vordruck 2.02”.

47 European Commission — Expert Group (2004, 15). In Germany: “Gesetz zur
Änderung des Gesetzes über Arbeitnehmererfindungen vom 18. Januar 2002”, in force
since 2 July 2002, BGBl. Part 1/2002, p. 414. (Jurisdictions that still adhere to the
Professor’s Privilege are Finland, Sweden and Norway; Italy recently introduced it.)

48 Although some restrictions apply: e.g. the academic scientist retains the right to
publish freely (§ 42 sec. 1 ArbNErfG).

49 Portraying patents, also in the academic sphere, as behavioral incentives to invent.

50 The driving force for academic innovation has been attributed to the scientific norm
of esteem in the scientific community, first described in depth by Merton (1938/1973,
1942/1973).

51 In the EU, launched with the 5th Framework Programme in 1998; in the US, through
developments instigated by the Bayh-Dole Act 1980, see Godt (2007, Chap. 3); Mowery
and Sampat (2005, 224ff).

52 For an economic behavioral analysis of this relation see Albert (2006).

have been acknowledged. A cooperative system between science and industry
has emerged.53

From the patent system’s and the behavioural perspective, the key question
is whether innovation has been causally stimulated by these reforms
fostering technology transfer. As far as preliminary results go, the evidence
seems to be mixed. There are other factors that influence the cooperation
between science and industry as much as the availability of patent protection.
Beyond institutional and intrafirm organisational arrangements (Owen-Smith
and Powell 2001, Bercovitz et al. 2001), there are other legal aspects that
foster or impede technology transfer. For instance, in contrast to the US,
European provisions on joint ownership do not allow one-sided licensing
without the consent of all co-owners, thus slowing down technology transfer
(European Commission — Expert Group 2004, 16-17). Property laws in
Europe are fragmented. Technology Transfer Offices are still in the process of
being built up. Also, the majority of scientists still adhere to classical research
norms like instant publishing and cooperative exchange. Both are potentially
detrimental to the claim of patents. Where an adaptation to financial
incentives in science has occurred, the repercussions of patents on research54
as well as the repercussions of scientific patenting on the patent system itself
(Nelson 2000) have been criticised.

Therefore, it is safe to say that the “problematique” of “royalty stacking” is
one facet of the changing environment of the science/industry interface.
However, if there is neither technology transfer nor financial gain for the
research institutions, then the suspension of classical research norms cannot
be justified. The phenomenon of “royalty stacking” retraces the profound
structural differences of research in academic and industrial settings. It points
to problems that were formerly dealt with by the exclusion of “discoveries” and
“theories” from the patent system. Those problems resurface and are
reinforced by contractual “reach through” arrangements. Stacked royalties
undermine both the policy reasons for which patents were introduced into the
realm of science and the traditional norms of science (as described by Robert
Merton). Impeding both patent mechanisms and mechanisms of science will
hamper the overall pace of innovation in the long run.

However, it is illusory to expect that the former invention/discovery
distinction can be reinstated. The convergence is due to the fading distinction
between basic science and applied science that is part of the information
society. Therefore, other policies must be devised to deal with the problems
that occur.


53 Coined by the EU as “innovation system”, European Commission — Expert Group
(2004, 32).

54 See only critics like v. d. Belt (2004) and Krimsky (1999).


4 How to catch the beast?


How can the various actors deal with the problem of stacking royalties? In the 
following, the capacities of industry (1), research institutions (2) and gov- 
ernmental public policy (3) will be considered. 


4.1 Industry 


As a first reaction, industry could consider the acquisition of research results
early in the process. However, this move contradicts the contemporary
industrial philosophy of reducing R&D costs by acquiring research results at a
fixed price later in the process, when commercial viability becomes a probable
option.

Therefore, strategies must be more effectively geared towards avoiding 
royalty stacking in scientific institutions. A first step, especially for IP managers 
in industry and lawyers in private practice negotiating these contracts, is to 
understand the functional differences of how research results emerge in a public 
and in a private research setting. Although the difference between basic science 
and applied science in respect of marketability has largely vanished, the process 
of how research results are produced is still different. This realization should 
caution against the transposition of contract clauses that may be common to 
industry, but may have different effects and be ultimately detrimental in sci- 
ence. Whereas industry has its own ways of dealing with burgeoning patents and 
licenses (mergers and acquisitions, closed or open patent pools) (Scotchmer
2004, 157), science is not in the position to apply these strategies. 

A starting point for industry involves two aspects. On the one hand, it can 
acknowledge that proprietarily secured technology transfer is perceived as 
socially valuable by both public policy and research institutions. On the other 
hand, it should understand that the dichotomy of the private and public 
research realm is ultimately favorable to economic evolution. Taking both into 
consideration, industry has at least two options to prevent the accumulation of 
royalties in research institutions. First, it can refrain from negotiating royalties. 
This seems to be a cooperative (information) problem inside industry that 
needs to be resolved. Any licensor of a research tool has an interest in 
negotiating as high a percentage as possible irrespective of the danger that the 
profit margin is used up before any end product has reached the market. The 
bottom line is, however, that everybody loses out because no product at all 
will be developed. This consideration might induce industrial associations to 
draw up a code of conduct aimed at reducing use restrictions and favoring 
one-time payments instead of royalties when licensing research tools to public 
research institutions. Second, industry can finance research tools, promote 
their pooling and open access, either by putting them into the public domain 
or by pooling them via “one-stop” (clearinghouse) arrangements.


4.2 Research institutions


The foremost goal for research institutions is to formulate a patent
strategy that articulates the profile of the research institution and to adopt
corresponding rules. These policies will position the institution somewhere on
the line between a purely publicly funded institution driven by research
interests formerly labeled as basic science (with no obvious commercial
viability) and an applied science institution aiming at revenue generated by
the sale of research results to industry. Such policies will include the duties
and freedoms of scientists, principles of their remuneration and publication
rules55 (especially rules on publication if research is funded directly by
private companies).

These policies translate into patent policies: If a research institution aims at 
being a basic science institution, not interested in technology transfer, then it 
should be easy to convince a licensor of patented tools to sell a tool instead of
licensing it. This strategy can be complemented by the recommendations of 
the Dutch Advisory Council for Science and Technology Policy (AWT) which 
advises research institutions not to patent very basic and broad inventions 
(Dutch Advisory Council for Science and Technology Policy (AWT) 2001). 
From the perspective of the licensor, the revenue in these institutions is 
uncertain anyway. This might help institutions such as Max Planck Institutes 
to avoid royalty promises altogether. On the other hand, for institutions 
working very closely with industry, royalty promises will be unproblematic. 
Industry is used to the royalty quarrels. The challenge lies with the “middle 
range” institutions, i.e. most universities. They have to devise procedural 
strategies to avoid royalties as far as possible. One policy principle might be to
oblige their researchers to avoid royalties by first trying to buy the tool. If this
is economically unreasonable, they must negotiate the smallest possible
royalty. Also, a form of recordkeeping needs to be established in order to stay
below the 20% margin that impedes later commercialisation.56
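
The recordkeeping suggested above could be as simple as a running ledger of
royalty promises per research project. The following is a minimal sketch, not a
description of any institution's actual practice: the class, field and licensor
names are hypothetical, the figures are invented, and the sketch assumes that
all promised royalties are shares of the same base, so that they simply add up.
The 20% margin mentioned in the text is used as the threshold.

```python
from dataclasses import dataclass, field
from typing import List

# Threshold taken from the text: accumulated royalties beyond about 20%
# are said to impede later commercialisation.
STACKING_LIMIT = 0.20


@dataclass
class RoyaltyPromise:
    licensor: str   # hypothetical label for the owner of the research tool
    tool: str       # the licensed research tool
    rate: float     # promised share, e.g. 0.03 for 3%


@dataclass
class ProjectLedger:
    """Running record of the royalty promises attached to one research project."""
    promises: List[RoyaltyPromise] = field(default_factory=list)

    def accumulated_rate(self) -> float:
        # Assumption of this sketch: all rates refer to the same base and add up.
        return sum(p.rate for p in self.promises)

    def can_accept(self, rate: float) -> bool:
        # A small tolerance absorbs floating-point rounding at the limit.
        return self.accumulated_rate() + rate <= STACKING_LIMIT + 1e-12

    def add(self, promise: RoyaltyPromise) -> None:
        if not self.can_accept(promise.rate):
            raise ValueError(
                f"promise to {promise.licensor} would exceed the "
                f"{STACKING_LIMIT:.0%} limit"
            )
        self.promises.append(promise)


if __name__ == "__main__":
    ledger = ProjectLedger()
    ledger.add(RoyaltyPromise("Licensor A", "cell line", 0.05))
    ledger.add(RoyaltyPromise("Licensor B", "assay method", 0.08))
    print(f"accumulated: {ledger.accumulated_rate():.0%}")
    print("room for another 10%?", ledger.can_accept(0.10))
```

A ledger of this kind would let a technology transfer office see, before a new
license is signed, how much of the margin earlier promises have already used up.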


4.3 Government public policy


Royalty stacking has to do with the newly emerging commodification of
information, with the patenting of research tools and with “reach through”
contracts. Governments should approach the emerging problems more
courageously. Mechanisms need to be devised for the financing of research tools.
Administrations can pool them, provide public access, or help industry to find


55 With respect to clauses relating to publication freedoms, a variety of model contracts
are already available; an overview is provided by Peter and Runge (2004).

56 The record is also important for use restrictions.

“one-stop” solutions, and devise policies promoting free access of non-commercial
research institutions to research tools.

One important instrument is the regulation of public funding. The licensing
of research tools can be limited by obliging recipients of public funding to
provide free access to the research results that emanate from it. Here, more
economic research needs to be done.57


5 Conclusion 


The phenomenon of “royalty stacking” threatens the very goal of technology 
transfer from science to industry. In this respect, it is a challenge to research 
policy. It is a result of two distinct mechanisms, one proprietary, the other 
contractual. The proprietary mechanism is rooted in the expansion of patents 
into areas traditionally defined as “discovery” or “theory” and formerly 
excluded from the patent system. The contractual mechanism is primarily due 
to the transition from sale contracts to lease contracts in the user market. In 
combination, these two mechanisms can have detrimental effects on the 
transfer of technology from science to industry when the royalty share 
becomes “too large”. Two lessons can be learnt. First, the claim of patents
does not per se secure the transfer of knowledge. A patent is only a conditio
sine qua non; other conditions have to be met as well. Second, the
phenomenon of “stacking royalties” sheds light on the diverse nature of the
scientific process. There are areas which are suited to commercialization, and
there are others which are not. The latter seem insusceptible to market
mechanisms. Patenting in the field of basic science, which was formerly
classified as a market failure (justifying public funding), gives rise to
problems that were once dealt with by its exclusion from the patent system.
With the fading distinction between basic and applied science, new mechanisms
have to be devised in order to conserve scientific norms if science is to
continue to serve as an incubator for “fresh knowledge”.

Thus, the phenomenon of “stacking royalties” helps to understand changes 
and continuities in science. Even if the concept of science and the market as 
opposites seems outmoded, differences persist. Science as a system has 
become diverse, integrating areas which can be modeled on market mecha- 
nisms. Other areas continue to function differently. These differences must be 
taken into account if research policies want to exploit the potential of both 
realms, the realm of “intentionless” science with long lasting processes and the 
realm of science with high susceptibility for economic innovation. 


57 See Scotchmer’s (2004, 152) idea of research exemptions counterintuitively
favouring the pioneer.


Royalty Stacking: A Problem, but Why?


Comment by 


CHRISTIAN KOBOLDT 


Christine Godt’s paper in this volume describes a problem that, in her words, 
“threatens the very idea of technology transfer from science to industry”. 
Because scientific research increasingly relies on a multitude of inputs that are 
protected by patents, and because the holders of these patents individually 
wish to negotiate royalties which link their remuneration to the commercial 
value of the output of such research, there is a risk that, by the time research 
results can be commercially exploited, the accumulated royalties have 
reduced the potential margin to such an extent that the investment that would 
be needed for successful cross-over has become unattractive.1 Put differently,
royalty stacking is a problem because it leads to research outputs being so 
encumbered with royalty promises that they become commercially unat- 
tractive and will not be exploited. The paper deals with this problem mainly 
from a legal perspective, discussing it in the context of the general feature of 
patent dependency, which supports incremental innovation, but which also 
creates the risk of accumulated royalties that may eventually stymie com- 
mercial success. 

In this comment, I will try to draw out more clearly the economic issues that 
are of interest with regard to the problem of royalty stacking. In particular, I 
will address the question whether accumulated royalties may indeed exceed 
the level that would be optimal for all the parties involved in cumulative 
research (which, in turn, may or may not be socially optimal — a question 
which I do not discuss), and if this is the case, why patent holders do not solve 
this problem by using different terms in licensing their intellectual property. 
Given the scope of this comment, I will raise mainly questions rather than 
provide answers, but I hope that this (unencumbered by royalties) will lead to 
further research. I will begin by assuming that patent holders wish to negotiate 
royalties in order to examine the impact on the overall royalty level, and then 
discuss whether this assumption is justified. 

From an economic perspective, royalty stacking seems to be a clear problem 
of externalities. By trying to increase their individual share of the prospective 
cake, patent holders put at risk the commercial success of the research project 


1 It is worth pointing out that the problem is not limited to technology transfer, but 
could arise with regard to any investment made in advancing a particular research 
project. Accumulated royalties affect the likely return that such an investment can be 
expected to earn, and thus royalty stacking may lead to research projects being aban- 
doned at any stage whenever the burden of accumulated royalties has reduced the 
expected return to a prohibitive level. 
(without which there is no cake that could be shared). The basic mechanism is 
easy to demonstrate. 

Assume there are n patent holders whose intellectual property is a neces- 
sary input into a research project. The project, if successful, generates a value 
of 1. For the sake of simplicity, assume further that all of these licensors are 
identical in terms of the contribution that their patents make to the success of 
the research project, and with regard to their requirement for compensation, 
and that licensors make a take-it-or-leave-it offer, i.e. set the royalty level.² 
Consider that the probability p of the research project being completed and 
successfully exploited decreases with the level of accumulated royalties 
$r = \sum_i r_i$, where $r_i$ denotes the royalty negotiated by licensor i. Thus, the 
expected value of the research project is $p(r)$ with $p' < 0$. For simplicity, assume 
that the probability decreases linearly with the accumulated royalty level, i.e. 
$p(r) = 1 - r$. 

Individually, each licensor wishes to maximise its share of the commercial 
value, i.e. set $r_i$ so as to maximise $r_i\,(1 - \sum_j r_j)$. Using the assumption of 
symmetry of licensors, solving the first order conditions in a model of 
simultaneous royalty setting³ gives an individually optimal royalty level of 
$r_i = 1/(n+1)$ for all $i = 1, \dots, n$. Cumulative royalties are therefore 
$r^* = n/(n+1)$. Whenever there are multiple licensors, this is in excess of the 
level of cumulative royalties $\hat{r}$ that would be optimal from the perspective of 
all licensors together. The collectively optimal royalty level is obtained by 
maximising $r(1 - r)$, which gives $\hat{r} = 1/2$. Unsurprisingly, the problem is worse 
the larger the number of licensors, and it is also worse (and progressively so) 
if royalties are being set sequentially.⁴ 
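
The royalty-stacking arithmetic above can be checked with a short sketch. It assumes the linear success probability p(r) = 1 - r used in the text; the function names are illustrative only, and the sequential case applies backward induction under the same linear assumption (each licensor in turn takes half of whatever margin its predecessors have left).

```python
from fractions import Fraction


def best_response(others_total):
    """Royalty r_i maximising r_i * (1 - others_total - r_i)."""
    # First-order condition: 1 - others_total - 2 * r_i = 0.
    return max(Fraction(0), (1 - Fraction(others_total)) / 2)


def simultaneous_total(n):
    """Cumulative royalty n/(n+1) when n identical licensors move simultaneously."""
    r = Fraction(1, n + 1)
    # Check that r is a mutual best response (a symmetric Nash equilibrium).
    assert best_response((n - 1) * r) == r
    return n * r


def sequential_total(n):
    """Cumulative royalty when n licensors set royalties one after another."""
    remaining, total = Fraction(1), Fraction(0)
    for _ in range(n):
        r_k = remaining / 2  # each mover takes half of the margin left to it
        total += r_k
        remaining -= r_k
    return total


def joint_revenue(total):
    """Expected royalty income of all licensors together: r * p(r)."""
    return total * (1 - total)


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        sim, seq = simultaneous_total(n), sequential_total(n)
        print(n, sim, seq, joint_revenue(sim), joint_revenue(seq))
```

For n = 2 the sketch gives cumulative royalties of 2/3 (simultaneous) and 3/4 (sequential), both above the collectively optimal level of 1/2.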

Thus, it is indeed the case that individual attempts to maximise royalty 
revenues lead to a collectively sub-optimal outcome. Each licensor enjoys the 
full benefits from increasing its royalty level, whereas the negative impact on 
the probability of the research project’s success is socialised. This would 
suggest that collective negotiations of royalties (or the use of one-stop 
clearinghouse arrangements as suggested by Godt) is one way to reduce the 
problem, although not one that is guaranteed to succeed. As we know from 
the economic literature, the existence of gains from co-operation is by no 
means sufficient for co-operation to succeed. 


2 This assumption obviously has a bearing on the level of royalties, but not on the 
general result that individually set royalties are cumulatively higher than collectively 
negotiated ones. 

3 Royalties are being set simultaneously in the sense that no individual licensor pos- 
sesses information about the royalty levels set by the other licensors. 

4 Where the k-th licensor sets its royalty level knowing the royalties set by all licensors 
$j < k$, and anticipating the royalties set by licensors $l > k$, it is easy to show that the 
optimal royalty for the k-th licensor is $r_k = 1/2^k$, and therefore the cumulative royalty 
level is $1 - 1/2^n$. It pays to be first in the licensing queue. 
Having a clear view of the problem associated with individually negotiated 
royalties, however, puts the focus on the second part of the question. Why, if 
the encumbrance of profit opportunities through (accumulated) royalties 
reduces the likelihood of commercial success, are licensors interested in 
royalty arrangements in the first place? Would it not be better to negotiate 
other forms of remuneration, such as an outright sale of, or a fixed licence fee 
for the use of intellectual property? Indeed, it is easy to demonstrate that both 
the licensee and the licensor could benefit from negotiating a fixed licence fee 
whenever royalty promises reduce the probability of successful commercial 
exploitation of research. 

Assume again that a research project generates a value of 1 if its results can 
be successfully exploited commercially, and that the probability of success 
depends on the level of (accumulated) royalties. The licensee would prefer to 
pay a royalty rather than a fixed licence fee $l$ for an input protected by a 
patent if $(1 - r)\,p(r) > p(0) - l$, i.e. if the licence fee is $l > [p(0) - p(r)] + r\,p(r)$. 
As the second term on the right-hand side of this inequality gives the 
expected payment to the licensor under a royalty arrangement, this implies 
that as long as p(0) > p(r), for any royalty level there must exist a licence fee 
which is preferred by both the licensee and the licensor. In very simple terms, 
if a royalty arrangement reduces the chance of success, the gains from 
avoiding the encumbrance with royalty promises can be split between the 
licensee and the licensor through a fixed licence fee instead of a royalty. This 
logic applies to multiple licensors, and suggests that negotiating a licence fee is 
better for each individual licensor and the licensee regardless of what the 
other licensors have done, or will do. 
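
To make the comparison concrete, the following sketch computes the range of fixed fees that both sides prefer to a given royalty. It assumes a single licensor and, purely for illustration, the linear success probability p(r) = 1 - r from the earlier example; the function names are hypothetical.

```python
def success_probability(r):
    """Illustrative assumption: success probability falls linearly in the royalty."""
    return 1.0 - r


def mutually_preferred_fees(r, p=success_probability):
    """Return (low, high): any fixed fee l with low < l < high beats royalty r for both sides.

    The licensor prefers a fee above its expected royalty income r * p(r).
    The licensee prefers a fee below [p(0) - p(r)] + r * p(r).
    """
    low = r * p(r)
    high = (p(0.0) - p(r)) + r * p(r)
    return low, high


if __name__ == "__main__":
    for r in (0.1, 0.25, 0.5):
        low, high = mutually_preferred_fees(r)
        print(f"royalty {r:.2f}: any fee in ({low:.4f}, {high:.4f}) is preferred by both")
```

The width of the interval is p(0) - p(r), so it is non-empty exactly when the royalty lowers the probability of success; with a success probability that does not depend on r, the interval collapses to a single point.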

Again, the existence of such gains from co-operation is not a sufficient 
condition for co-operation to occur, but the fact that patent holders and 
research institutions can do better by agreeing to a fixed licence fee rather than 
a royalty arrangement in cases where encumbrance with royalty promises 
reduces the potential commercial value of the research raises the question 
why royalty arrangements are observed in such a context. It certainly suggests 
that further analysis would be required with regard to the claim by Godt that, 
in principle, royalties are in the interest of both parties. 

In this regard, it is worth pointing out that the fact that research increasingly 
relies on intellectual property rather than physical capital or labour is not, in 
itself, a reason for the use of royalties. Although it is true that it may not at all 
be in the patent holder’s interest to ‘sell’ the patent rights, there should be 
nothing stopping her from licensing her intellectual property for a fixed fee 
that is payable irrespective of whether or not the research ultimately has 
commercial success. Thus, we must look elsewhere for an explanation as to 
why royalty agreements are entered into even in cases where they reduce the 
expected value of the research project (and in particular where the problem 
may be exacerbated through royalty stacking). 
From an economic perspective, a number of potential explanations spring to 
mind: 

— There could be a problem of asymmetric information. If the licensee has 
better information about the likely commercial value of the project, licensors 
may fear that they do not receive their fair share when negotiating a fixed 
licence fee. The licensee may understate the likelihood of success or the 
commercial value of the project, and rather than agreeing to a fixed licence 
fee based on the value predicted on the basis of information provided by the 
licensee, the patent holder may wish to negotiate a royalty which links pay- 
ments to the value that will actually be realised. 

— Licensors may also simply be myopic or unaware of the detrimental 
impact that a royalty arrangement can have on the likelihood of commercial 
success, although in this case the licensee should find it easy to alert licensors 
to the potential downside. 

— Conversely, there may be principal-agent issues involved on the part of 
the licensee in the sense that those agreeing to royalty payments (e.g. 
individual researchers) are not the residual claimants of the commercial value 
of the research. As Godt points out, they are not the ones who pay future 
royalties, whereas fixed payments now would have to come out of their 
research budgets. Thus, researchers motivated by scientific rather than com- 
mercial success may well prefer to agree to royalties, as this may allow them to 
achieve their objective at a lower cost, even though the overall impact is 
detrimental. 

— Licensee and licensor may have different preferences with regard to the 
risk associated with the research project, or may have different discount rates 
and therefore attach different relative weights to future royalty payments/ 
receipts and current fixed payments/receipts. 

— There may be capital market imperfections that limit the ability of the 
licensee to fund current payments of a fixed licence fee against future rev- 
enues from successful commercialisation. 

— There may be other constraints in play which affect the choice of 
licensing arrangements. For example, it may be the case that royalty agree- 
ments perform better in some cases, and that licensors may be worried about 
allegations of discriminatory treatment that might lead to complaints or 
private litigation under competition law.⁵ 


5 This would obviously be an issue only in cases where the licensor may be deemed to 
have market power. It is worth pointing out that the reason why licensors may prefer 
royalties to fixed payments in other settings may have to do with the effect on com- 
petition of cross-licensing arrangements where licensor and licensee compete in a 
downstream market. In such cases, royalty payments have the effect of profit-sharing 
arrangements, which may affect the intensity of competition downstream. 
It would certainly be worthwhile to investigate in greater detail which of these 
reasons is mainly responsible for the choice of royalties over fixed licence 
payments, and to what extent their relative importance varies between 
‘industry’ and ‘science’. 

This would provide a better insight into the differences between 
these two areas, and suggest why the problem of royalty stacking is one that 
particularly affects licensing of intellectual property in the context of scientific 
research. More importantly, a better understanding of the underlying causes 
for using royalty agreements in cases where they reduce the expected value of 
research is indispensable for identifying potential solutions to the problem of 
royalty stacking. A ‘simply say no’ approach to royalty arrangements, the 
promotion of collective negotiations, or a cross-over to industry before the 
royalty burden becomes too large, as suggested by Godt, may not be the only 
responses, and may not necessarily be the best ones. 


An Economic Theory of Academic Competition: 
Dynamic Incentives and Endogenous Cumulative Advantages 


by 
NICOLAS CARAYOL 


1 Introduction 


The implicit and explicit rules of academic research stress a specific reward 
system in which priority is essential (Merton 1957). The recognition of a 
scholar as the intellectual proprietor of the knowledge she produced increases 
her credit within the peer community. In turn scientific reputation translates 
into increased wages, more prestigious positions and other non-monetary 
rewards (Dasgupta and David 1994). The academic reward system appears to 
be fundamentally reputation-based, which has two main implications for the 
provision of incentives across time (for a given scholar) and across scholars. 
First, reputation-based incentives tend to distort the distribution of incentives 
during researchers’ careers since the returns of research activity are usually 
delayed and spread over the remaining professional cycle. Scientists are thus 
essentially facing dynamic incentives (career concerns). Because the expected 
returns of efforts are decreasing with the remaining activity period, the sharp 
decline of incentives in the late career is likely to overbalance the experience 
effect. Thus, an inverse-U shape of scientific production distribution over the 
career cycle is predicted. Several empirical studies using panel data of pub- 
lication profiles have corroborated this statement (Weiss and Lillard 1982, 
Levin and Stephan 1991).¹,² The second consequence of such a reputation-based 
reward system is that resources and means are not uniformly distributed 
across agents but tend to be concentrated in the hands of those who have 
more credit. Cumulative processes bias the academic competition, providing 
some competitive advantage to the agents who have experienced the best 


1 Stephan and Levin (1997) show that the peak is often attained between ages 35 and 
50 in biochemistry and physiology. Diamond (1986) finds that the publication profiles of 
Berkeley University mathematicians decrease over the whole career. This specificity may 
be explained by the low experience effect in this discipline. 

2 Such results suggested improvements of human capital theory. Two technical solutions 
may be used. The first one consists in introducing human capital depreciation. 
Since it also generates a counter-factual decrease in wages, another solution has been 
suggested by Levin and Stephan (1991), who introduced a “puzzle solving” argument in 
academics’ objective function, namely, the agents do not value only wages but also 
scientific production itself (or derive satisfactions from it other than the wage). 
early career accomplishments.³ Analyses of the distribution of publication 
records among scholars⁴ and time series analysis (Allison and Stewart 1974) 
support the cumulative advantage hypothesis. 

This paper presents an original model of academic competition⁵ which 
encompasses the two stylized facts exposed above. It relies upon the following 
simple mechanism. We suggest that academic positions differ both in their 
associated productivity and in the utility they provide to scholars (both due to 
different wages and other non-monetary satisfactions). At the different stages 
of their career path, agents compete to get the best positions. In equilibrium, 
universities’ hiring and promotion decisions are taken according to the sci- 
entific production (or credit) ranking at the previous stage. Therefore, the 
scientists most productive in their early careers are favored in the next stages. 

Empirical evidence supports the idea that a mechanism of this kind is at 
play in academia. The universities, as well as the research positions they offer, 
are quite heterogeneous in terms of their associated productivity. Stephan 
(1996) argues that prestigious institutions are endowed with heavy instru- 
mentation equipment that less established ones cannot afford. Cole (1970) 
shows that the reputation of the hosting institution generally signals 


3 R. K. Merton (1968, 1988) gives the label of Matthew effect to the various cumulative 
advantages affecting the academic sphere. He refers to the quotation of the Gospel 
according to Saint Matthew: “for unto every one that hath shall be given, and he shall 
have abundance: but from him that hath not shall be taken away even that which he 
hath”. The first evidence of cumulative advantages in academia is due to H. Zuckerman’s 
Ph.D. thesis defended in 1965 (with Merton as a supervisor), which was dedicated to 
studying Nobel laureates’ career paths. In their early work, both Merton and Zuckerman 
tend to limit the application of the notion to the symbolic mechanism according to which 
already reputed scholars gain more credit than less reputed ones from a co-authored 
paper or from a simultaneous discovery. The extension to various cumulative advantages 
comes later (e.g., in Merton 1988). 

4 This distribution is known to be highly skewed: few researchers publish many articles 
and many researchers publish only a few papers each. The shape of the distribution 
can be well approximated by an inverse power distribution (power law) given by the 
function $f(n) = a n^{-k}$, with f(n) as the number of authors having published n papers and a 
and k as parameters of the law. When k = 2, this expression is identical to the one initially 
proposed by Lotka (1926). Many empirical studies have confirmed the relevance of this 
distribution for different scientific domains, see, e.g., Murphy (1973) for the humanities, 
Radhakrishnan and Kernizan (1979) for computer science, Chung and Cox (1990) for 
finance, Cox and Chung (1991) for economics, Newman (2000) for physics and medicine, 
Barabasi et al. (2001) for mathematics and neuroscience, etc. 

5 Several modelling attempts have considered other dimensions of academia. Car- 
michael (1988) intends to explain the tenure system. Merton and Merton (1989) describe 
the optimal timing scheme for solving a set of scientific problems. Lazear (1997) models 
funding agencies. Brock and Durlauf (1999) propose a model of discrete choice of sci- 
entific theories when agents have an incentive to conform to the opinion of the com- 
munity. Lach and Schankerman (2003) model the licensing of scientific discoveries. 
Carayol and Dalle (2004) propose a model of scientific knowledge accumulation over an 
increasing set of scientific areas. 
researchers’ abilities, enabling them both to more efficiently raise funds and to 
more quickly and widely diffuse their results in the scientific community. 
Hansen et al. (1978) showed that the quality of the researchers’ universities is 
the critical variable for explaining future production. Cole and Cole (1973) 
found a university department quality effect on the productivity of physicists. 
Some empirical evidence also suggests that the allocation of the best positions, 
either through internal promotions or through hiring decisions (Garner 1979), 
is mainly based on past scientific production. Zivney and Bertin (1992) 
showed that the researchers “tenured” in the twenty five most reputed finance 
departments of US universities previously published perceptibly more than 
the average tenured researcher. Having studied the mobility of more than 
3,800 economists, Ault, Rutman and Stevenson (1978, 1982) showed that the 
main determinant of the quality of the first position hosting institution is the 
quality of the training university (both at undergraduate and graduate levels) 
and the quality of the university where the Ph.D. has been defended. More- 
over, they showed that further “upward mobility” (mobility associated with an 
increase in the quality of the institution) is mainly explained by past pub- 
lications (even if the effect is limited). The most productive agents will benefit 
from better research positions and are in turn likely to publish more. In this 
way, the academic competition is dynamically biased in the sense of Merton’s 
cumulative advantage, because the initial successes tend to further improve 
productivity and, in turn, favor late successes. Thus, in the very nature of the 
academic employment relationship lies one of the sources of the cumulative 
advantage process. 

More precisely, we model two employers (we refer to universities, but it 
could be departments or research labs) which offer at each period research 
positions at all career stages — Ph.D., junior and senior levels — to overlapping 
generations of two researchers. When taking promotion and hiring decisions, 
universities cannot (or just do not) observe either agents’ efforts or the cardinal 
values of past productions: such decisions are taken on the basis of agents’ 
past production ranking.⁶ At each stage of the career, positions differ both in 
terms of their remuneration and their associated productivity. There is a 
productive premium due to both the accumulated reputation of the host 
institution and positive spillovers from first-ranked colleagues of other gen- 
erations within the university. At the junior and senior stages, the previously 
most productive agents will select the positions they prefer, while others will 
accept the remaining academic positions or even choose the outside option 


6 The structure of the model has much in common with the biased contests literature 
initially applied to sequential auctioning (Laffont and Tirole 1988), imperfect meas- 
urement of agents’ production within firms (Milgrom and Roberts 1988, Prendergast and 
Topel 1996), and, lastly, career paths within firms that are either autoregressive (“late- 
beginner effect”, Chiappori et al. 1999) or dynamically correlated (“fast track”, Meyer 
1991, 1992). 
(i.e., leave academia). Getting the most productive position improves the 
chances to win the next competition round, that is, to get again more pro- 
ductive positions and higher wages. This is the way cumulative advantage in 
researchers’ competition is captured in the model. In this paper, we also 
explicitly introduce competition between universities, which also compete at 
each period to hire the best researchers at junior and senior stages. Therefore, 
hiring decisions according to agents’ ranking, wages, and cumulative advan- 
tages are endogenously determined in a dynamic setting. 

Our main results are the following. We derive researchers’ equilibrium 
efforts and show that, as highlighted by several empirical contributions 
(Zuckerman and Merton 1972, Allison and Stewart 1974), the anticipated 
cumulative advantage improves early career efforts (because it generates 
dynamic incentives) while it deters late career efforts. However, the effect of 
the cumulative advantage on scientific production over the whole career 
remains ambiguous.⁷ As regards universities competition, we derive Markov 
perfect equilibria of the game under non-restrictive assumptions. The equili- 
brium is stationary in the long run: a fixed cumulative advantage endoge- 
nously arises that the leading university confers to its researchers. We pre- 
cisely compute the equilibrium and the optimal wages and show that the 
equilibrium wages offered to second-ranked agents are optimal and that the 
wages offered by the leading university to the first-ranked agents may be 
larger or lower than their optimal counterparts. This is because leading uni- 
versities do not internalize the positive incentive effect of the wages they offer 
on scholars hired by other institutions at previous stages. 

The paper is organized as follows. The model is presented in the next sec- 
tion. The third section is dedicated to agents’ behaviors under such a biased 
competition. The fourth section introduces universities competition. The fifth 
section compares optimal to equilibrium wages. The last section concludes. 


2 The model 
2.1 Main features 


Let us define the population of academic researchers as overlapping generations 
of agents whose career lasts three periods. Let $p \in \{1, 2, 3\}$ denote the 
periods at which the agents can be, respectively, Ph.D. students, junior researchers, and 
senior researchers. At each period $t$ of discrete time, a fixed cohort $C^t$ arrives, 


7 In a previous contribution (Carayol 2003), which introduces a more specific model 
(only one cohort), I specifically studied and found an optimal level of cumulative 
advantage, i.e., the second stage competitive bias given to the first stage winner that 
optimally balances early career incentive effect and late career disincentive effect. 
composed of two researchers. There are two research institutions, say, 
universities $\{i, j\}$, each of which offers a position at each stage. 

The outcome of scholars’ work is assumed to result in some aggregated 
output which can be called research production (that is, papers properly 
weighted to account for quality differences or, equivalently, credit in the peer 
scientific community). Research production is supposed to be additively 
separable in efforts over the first two periods of activity. At period t, the 
research output of the agent employed in university i at stage p of her career is 
given by 


(1) $y_i^{p,t} = f^p(e_i^{p,t}) + b_i^{p,t} + \varepsilon_i^p$. 


It is a function of the effort $e_i^{p,t}$ spent at time t by the agent being at stage p of her 
career, with $f^p$ a positive and increasing function whose derivative gives the 
productivity of effort at the different stages of the career (Ph.D., junior and 
senior: $p \in \{1, 2, 3\}$). $f^p$ is assumed to be strictly increasing, concave, and null 
when the agent exerts zero effort: $f^{p\prime} > 0$, $f^{p\prime\prime} < 0$ and $f^p(0) = 0$. The term $\varepsilon_i^p$ 
is the specific random shock that affects agent i’s production at stage p, where 
$E[\varepsilon_i^p] = 0$. Let us assume these shocks are iid across agents and periods. We 
define $\Delta\varepsilon^p$ as the difference between the individual random shocks at stage p: 
$\Delta\varepsilon^p = \varepsilon_j^p - \varepsilon_i^p$. The distribution function of this random variable is denoted $G^p$ 
and its density function is $g^p$. The latter is assumed to be unimodal, continuously 
differentiable, strictly positive over $(-\infty, \infty)$, and symmetric around 
its unique maximum attained at 0 (which implies that $g^{p\prime}(0) = 0$). The term $b_i^{p,t}$ 
gives the surplus of credit which is due to the agents’ context of work: it is an 
attribute of the university in which the agent is working. This position specific 
component is formed as follows: 


(2) $b_i^{p,t} = a_i^t + \beta_i^{p,t}$, 


with $a_i^t$ as the effect of the accumulated reputation of the institution on agents’ 
production. For simplicity, we assume that it is independent of the stage p. The 
vector $a^t = (a_i^t, a_j^t)$ synthesizes the reputational advantages of the two 
universities at t. The term $\beta_i^{p,t}$ is the potential production premium due to the 
previous period ranking of co-workers in the academic institution. It is a 
positive externality which can be seen as a reputation effect. Alternatively it 
might also be thought of as a spillover due to costless interactions (like good 
advisers or next-door office colleagues). It can be computed as 


(3) $\beta_i^{p,t} = \beta \times \mathbb{1}\{\, i|p'|t \text{ ranks first, with } p' \neq p \,\}$ 


for $p, p' = 2, 3$, with $\beta$ a positive parameter and $\mathbb{1}\{\cdot\}$ the indicator function 
which is equal to 1 if the condition in brackets holds. The expression “$i|p'|t$ 
ranks first” simply means that university i employs at t a researcher who is at 
stage $p'$ of her career and who was ranked first at the end of the preceding 
period (period $t - 1$ and stage $p' - 1$). This assumption formalizes the idea that 
junior and senior scientists gain some production premium $\beta$ from working 
with high-ranking scientists of the other generation. Ph.D. students (first 
stage) are assumed to gain equally from the ranks of the two older generations: 
$\beta_i^{1,t} = \frac{1}{2}(\beta_i^{2,t} + \beta_i^{3,t})$. The term $a_i^t$ is simply constituted of the 
accumulation of the past premia as follows: 


(4) $a_i^t = \sum_{p=2,3} \; \sum_{\tau=1,\dots,T} \gamma^{\tau} \beta_i^{p,t-\tau}$ 


with some discount factor $\gamma$ over a given relevant period of time T. If $\gamma$ tends 
to 1, then past and present advantages have nearly equal weights in present 
production. When $\gamma$ tends to 0, then $a_i^t$ also tends to zero and the production 
premium $b_i^{p,t}$ tends to be restricted to the present spillover $\beta_i^{p,t}$. 
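
As an illustration of how the position-specific premium might be assembled, the sketch below adds the current spillover to a discounted sum of past premia. It assumes that past premia are weighted geometrically by the discount factor (the premium earned tau periods ago receives weight gamma**tau); the function names and the list-based history are hypothetical conveniences, not part of the model.

```python
def spillover(beta, colleague_ranked_first):
    """Current premium: beta if the other-generation colleague ranked first, else 0."""
    return beta if colleague_ranked_first else 0.0


def reputation(past_premia, gamma):
    """Discounted sum of past premia.

    past_premia[0] is the premium of the most recent past period (tau = 1),
    past_premia[1] the one before (tau = 2), and so on. This weighting is an
    assumption made for the sketch.
    """
    return sum(gamma ** tau * premium
               for tau, premium in enumerate(past_premia, start=1))


def production_premium(beta, colleague_ranked_first, past_premia, gamma):
    """Position-specific premium: accumulated reputation plus current spillover."""
    return reputation(past_premia, gamma) + spillover(beta, colleague_ranked_first)


if __name__ == "__main__":
    history = [0.3, 0.0, 0.3]  # hypothetical premia from the last three periods
    for gamma in (0.0, 0.5, 0.9):
        print(gamma, production_premium(0.3, True, history, gamma))
```

With gamma equal to zero the reputation term vanishes and only the current spillover remains, which matches the limiting case described in the text.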

Agents’ instantaneous net utility is given by the function $W(s_i^{p,t}, b_i^{p,t}, e_i^{p,t})$, 
which is assumed to be additively separable between disutility of efforts and 
utility as follows: 


(5) $W_i^{p,t} = U(s_i^{p,t}) + \varphi \times \mathbb{1}\{b_i^{p,t} > b_j^{p,t}\} - V(e_i^{p,t})$ 


where $\varphi > 0$ represents the surplus of satisfaction derived from being in the 
most prestigious institution. Agents value not only wages but also the prestige 
of their host institution.⁸ U is an instantaneous utility function that assumes 
agents to be intra-period risk averse: $U : (0, \infty) \to (-\infty, \infty)$ such that $U' > 0$, 
$U'' < 0$, and $\lim_{s \to 0} U(s) = -\infty$. The instantaneous disutility of efforts function 
$V : (0, \infty) \to (0, \infty)$ is assumed to satisfy $V(0) = V'(0) = 0$, $V' > 0$, $V'' > 0$, and 
$\lim_{e \to \infty} V'(e) = \infty$. 


The whole career net utility function is assumed to be additively separable 
between the three periods of the career. We also assume that agents do not 
have access to the financial market; so they can neither save nor borrow and, 
thus, consume their whole revenue received at each period. Thus, the total net 
utility of agent i of cohort $C^t$ discounted to the initial period t is given by 


(6) $W_i = \sum_{p=1,2,3} \delta^{p-1} W_i^{p,\,t+p-1}$ 


with $\delta$ the agents’ discount factor. 


8 Levin and Stephan (1991) assume that scholars’ objective function has a “puzzle 
solving” argument, that is, scholars also directly like publishing papers. Our assumption is 
slightly different in that we rather assume that scholars like being in a distinguished and 
prestigious institution. 
2.2 Information and timing 


At each period, universities offer a position at each of the three stages. At the 
first stage, wages are uniform and exogenously fixed and agents are assigned 
to Ph.D. research positions. At the junior and senior stages, the universities 
compete in wage offers. Universities cannot adapt wages to cardinal infor- 
mation on past production: they only compare the scientific production of the 
two candidates at the previous stage.⁹ At both levels, the positions they offer 
have distinct associated productivities. The universities play simultaneously 
and have an infinite time horizon. This game is called the Universities Game. 

Agents have a life cycle point of view and face intra-cohort competition. 
They compete during the first round when they are Ph.D. students. At the end of the 
first period, they can access junior positions offered by universities. Positions 
are characterized by an associated utility and a production premium. Given 
universities competition, the most productive agent chooses the university 
that offers the preferred junior position. The other agent accepts the 
remaining junior position offered by the other university or defects and takes 
the outside option. If not, the two agents compete in the following stage. 
Again, the most productive chooses the senior position he prefers. There is a 
cumulative advantage because the first winner can choose the junior position 
that provides an advantage to get the best senior position. At the end of the 
third period, they retire. 
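
The cumulative advantage described here can be illustrated numerically. The sketch assumes, beyond what the model requires, that individual shocks are normally distributed with standard deviation sigma and that both agents exert the same effort, so that the winning probability depends only on the gap in production premia; all names are illustrative.

```python
import math


def win_probability(premium_gap, sigma=1.0):
    """P(y_i > y_j) when efforts are equal and shocks are iid N(0, sigma**2).

    The difference of two such shocks is N(0, 2 * sigma**2), so the
    probability is the normal CDF evaluated at premium_gap / (sigma * sqrt(2)),
    which simplifies to the erf expression below.
    """
    return 0.5 * (1.0 + math.erf(premium_gap / (2.0 * sigma)))


if __name__ == "__main__":
    # A larger premium for the first-round winner tilts the next contest further.
    for gap in (0.0, 0.25, 0.5, 1.0):
        print(f"premium gap {gap:.2f}: winning probability {win_probability(gap):.3f}")
```

With no premium gap the contest is a coin flip; any positive gap gives the previous winner a better than even chance of winning again, which is the sense in which the competition is biased.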

Agents competition is analyzed in the following section while section 4 
deals specifically with universities competition. Time consistency between the 
two interrelated competitions is due to the fact that efforts are not observable 
and that cardinal information on production is not available. Previous-stage 
contest ranking is the only information available. Therefore, scholars care 
only about future wages and productive advantages that they consistently 
believe to be stationary. As we shall show, universities care only about pre- 
vious period ranking, the other university’s strategy, and the present efforts of 
its current employees at previous stages when setting their wage offers. 


3 Researcher behavior 


We now concentrate on the computation of scientists’ equilibrium behavior, 
leaving aside the issue of universities competition that will be treated in the 
next section. Important for us now, we shall show there that, at each stage, the 
agent who wins the competition occupies (as expected) at the next period the 


9 Thereby, we also assume that universities do not consider the ranking of the Ph.D. 
stage to hire a senior researcher. This information is either neglected, irrelevant, or just 
lost. 
position that provides the highest satisfaction, which also provides the highest 
productivity. Let the difference in productive advantages $b^p$ at each stage p = 
1, 2, 3 be denoted $\Delta b^p$.¹⁰ We shall also define $\Pi_i^p$ as the expected utility of the 
agent employed at university i¹¹ from stage p to the end of the career given its 
level of information so far. Formally, we have 


(7) IE =E > owe, 


q=p 


with E as the expectation operator. 

At each stage, agents maximize expected utility over their remaining career 
cycle. We use standard backward induction reasoning. Since there is no motive 
for any competition in the last stage of the career, we have $e_k^3 = 0$, $k = i, j$, and 
third stage production is thus equal to the shock plus the potential production 
premium.¹² We thus concentrate on the first two stages of the career: we first 
study the second stage Nash equilibrium and then the first stage subgame 
perfect Nash equilibrium. 


3.1 Second stage Nash equilibrium 


At the second stage, agent $i$ chooses her effort level $e_i^2$ given $j$’s ($e_j^2$) in order to 
maximize her expected net utility from then on. Agents believe that they will get 
more utility if they win than if they lose, so that $\Delta U^3 > 0$, a belief which is 
consistent with universities’ behavior, as we shall show in the next section. 
Thus, $i$’s program at the beginning of stage 2 consists in 


(8)   $\tilde e_i^2 = \arg\max_{e_i^2} \{ P(y_i^2 > y_j^2) \times \delta_a (\bar U^3 + \rho) + [1 - P(y_i^2 > y_j^2)] \times \delta_a \underline U^3 - V(e_i^2) \}$.


$\bar U^3$ is the utility derived from the wage if the second contest is won and $\underline U^3$ if it 
is lost. The Nash equilibrium effort level ($\tilde e^2$) maximizes the expected net 
utility discounted to the second career stage. It is equal to the probability of 
winning the second contest times the utility the agent will receive if she wins the 
contest, plus the probability of losing that contest times the utility received if 
she loses it, net of the disutility of effort.¹³ An identical program could be 


10 Since the analysis in this section will be limited to only one cohort’s behavior, we 
will remove time superscript and uniquely refer to career periods (given by p). 

11 In this section “agent $i$” and “the agent employed by university $i$” have the same 
meaning. 

12 Clearly, senior researchers do exert efforts in real life. Nevertheless, this behavior 
seems not to respond to the kind of motives that are considered in this model. 

13 We omit second stage utility since it is independent of second stage efforts. 
written for $j$. The simultaneous solution of the two programs, detailed in the 
appendix, leads to a unique Nash equilibrium which is symmetric and given by 


(9)   $\tilde e^2 = \Theta_2^{-1}(\delta_a (\Delta U^3 + \rho)\, g^2(\Delta b^2))$, 


with the function $\Theta_2$ defined as $\Theta_2(x) = V'(x)/f^{2\prime}(x)$. This function is null at 0 and 
strictly increasing thereafter (and, thus, so is its inverse function $\Theta_2^{-1}$). 
Therefore, second period efforts increase with agents’ discount 
factor ($\delta_a$) and with the third stage differences in utility, whatever their 
source: the difference in wages ($\Delta U^3$) or the satisfaction derived from being in 
a prestigious institution ($\rho$). The effect of the cumulative advantage ($\Delta b^2$) is 
negative (for details, see the appendix). 
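
As a numerical illustration of equation (9), the following Python sketch evaluates the second stage equilibrium effort. It borrows the quadratic disutility and the linear production function specified in section 5.1, so that $\Theta_2^{-1}(z) = \mu a z / c$. The centred normal density used for $g^2$, the function names and all parameter values are our own illustrative assumptions, not part of the model.

```python
import math


def normal_pdf(x, sigma=1.0):
    """Density of a centred normal; stands in for g^2 (an assumption)."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def second_stage_effort(delta_a, dU3, rho, db2, a=1.0, mu=1.0, c=1.0, sigma=1.0):
    """Equation (9) with V(e) = c*e^2/2 and f^2(e) = mu*a*e.

    Under these forms Theta_2(x) = c*x/(mu*a), hence
    Theta_2^{-1}(z) = mu*a*z/c.
    """
    z = delta_a * (dU3 + rho) * normal_pdf(db2, sigma)
    return mu * a * z / c


if __name__ == "__main__":
    # Effort for increasing values of the cumulative advantage db2.
    for db2 in (0.0, 0.5, 1.0, 2.0):
        print(db2, round(second_stage_effort(0.9, 1.0, 0.2, db2), 4))
```

Under these assumptions, the computed effort rises with the discount factor and with the utility differential, and falls as the cumulative advantage grows.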


3.2 First stage subgame perfect Nash equilibrium 


We now turn to agents’ first period behaviors. Agent i’s objective is to 
determine her first period effort level in order to maximize expected net 
utility, that is, 


(10)   $\tilde e_i^1 = \arg\max_{e_i^1} \{\Pi_i^1\}$. 


This maximization program is different from the second period one since the 
first period success influences the second stage competition. The unique and 
symmetric subgame perfect Nash equilibrium (again, detailed computations 
are in the appendix) is given by 


(11)   $\tilde e^1 = \Theta_1^{-1}(g^1(\Delta b^1)\, \delta_a [\Delta U^2 + \rho + \delta_a (2G^2(\Delta b^2) - 1)(\Delta U^3 + \rho)])$, 


with $\Theta_1$ defined as $\Theta_1(x) = V'(x)/f^{1\prime}(x)$ and $\Theta_1^{-1}$ increasing. Moreover, since 
$\Delta b^2 > 0$ and $g^2$ is symmetric around its unique extremum at 0, $G^2(\Delta b^2) > 1/2$ 
and, thus, $2G^2(\Delta b^2) - 1 > 0$. Thus, first stage equilibrium efforts $\tilde e^1$ 
increase with the discount factor and with the differences in utilities at the 
junior and the senior stages ($\Delta U^2$ and $\Delta U^3$), whatever their source: the 
differences in wages or the differences in satisfaction from being in the most 
prestigious institution. The effect of the cumulative advantage ($\Delta b^2$) is positive, 
while the effect of the initial advantage ($\Delta b^1$) is negative (for details, see the 
appendix). 
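
The comparative statics of equation (11) can be checked numerically with the Python sketch below. As before, it assumes the section 5.1 forms $V(e) = \frac{1}{2} c e^2$ and $f^1(e) = a e$, so that $\Theta_1^{-1}(z) = a z / c$. Taking $g^1 = g^2$ to be a centred normal density is an illustrative assumption of ours, as are the function names and parameter values.

```python
import math


def normal_pdf(x, sigma=1.0):
    """Centred normal density; stands in for g^1 and g^2 (an assumption)."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, sigma=1.0):
    """Centred normal distribution function; stands in for G^2."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def first_stage_effort(delta_a, dU2, dU3, rho, db1, db2, a=1.0, c=1.0, sigma=1.0):
    """Equation (11) with V(e) = c*e^2/2 and f^1(e) = a*e.

    Under these forms Theta_1^{-1}(z) = a*z/c.
    """
    continuation = delta_a * (2.0 * normal_cdf(db2, sigma) - 1.0) * (dU3 + rho)
    z = normal_pdf(db1, sigma) * delta_a * (dU2 + rho + continuation)
    return a * z / c


if __name__ == "__main__":
    base = dict(delta_a=0.9, dU2=1.0, dU3=1.0, rho=0.2)
    print(first_stage_effort(db1=0.5, db2=0.2, **base))
    print(first_stage_effort(db1=0.5, db2=1.0, **base))  # larger db2
    print(first_stage_effort(db1=1.0, db2=0.2, **base))  # larger db1
```

In this sketch a larger second stage advantage raises first stage effort, while a larger initial advantage lowers it, in line with the signs stated above.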

These results are summed up in proposition 1 below. It confirms the results 
of several empirical studies according to which the cumulative advantage 
stimulates early career efforts while it diminishes late career efforts. 
Proposition 1. Agents’ equilibrium efforts are unique and symmetric at each 
stage of the career cycle. Equilibrium efforts increase with the differences in 
utility provided by positions at the remaining stages of the career. The cumulative 
advantage which favors in the second stage the winner of the first 
competition increases first period equilibrium efforts while it decreases 
second period efforts. The first stage advantage decreases first period efforts. 


4 Universities competition 


In this section, we shall focus on the Universities Game. At the end of each 
period, the scholars’ ranking is public knowledge. At the beginning of the next 
period, the universities offer one position at each stage p=1, 2, 3. To fill their 
available junior and senior positions (agents are assigned exogenously to 
Ph.D. positions at the beginning of the career), universities compete to hire 
the best researchers. The universities do not care about the institutions 
where the researchers were previously employed, and cardinal 
information on past production is not available (or not relevant given the 
institutional constraints). The universities compete simultaneously in wages. 
The competition is asymmetric with respect to their respective reputations. 
Universities consider the game as lasting infinitely. 

Let us denote by $\Omega_i$ the objective function of university $i$ at some time 
period $t_0$. Without loss of generality, we will consider $t_0 = 0$ in order to avoid 
cumbersome notation. The objective of university $i$ is its discounted net surplus 
over an infinite period of time: 


(12) Q; = E Syy" 8) 


The parameter $\psi > 0$ gives the per-unit value of scientific knowledge captured 
or just considered¹⁴ by the university. It is assumed to be homogeneous or 
normalized. Parameter $\delta$ is the discount factor. 

Let now $\Delta y^{p,t}$ denote the net surplus of production university $i$ gets from 
employing at stage $p$ and period $t$ the scientist who won her preceding academic 
contest (as compared to hiring the one who did not). It directly derives 
from the preceding section that this surplus is only composed of the direct 
productive complement a first-ranked agent provides to other researchers 
employed by i (at other stages). It is independent of i and can directly be 


14 If the university is controlled by any external institution or body having its own 
goals (e.g., a state), the rate $\psi$ in the objective function might be higher than the effective 
rate of return of scientific knowledge on the university budget. It would become closer 
to its social value. For a comparison between $\psi$ and the real social value of scientific 
knowledge, see section 5. 
computed from agents’ production function (1) and from the equation which 
sets the direct externality: 


(13) Ay = Ay? = 38. 


We denote by $y^{p-1}$ the expected production due only to an agent’s efforts 
(without considering the premium). It is affected only by her equilibrium 
efforts (which are unique at each stage, as shown in section 3). 

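The universities’ payoffs at period $t$ can be written as a function of the 
wages offered by the two universities at the last two stages, summed up in the 
vector $s^t = (s_k^{p,t})_{k=i,j;\, p=2,3}$: 


(14)   $\pi_i(s^t) = \sum_{p=2,3} \left[ (\psi y^{p-1} - s_i^{p,t}) \times \mathbb{1}\{\Pi_i^p > W^p\} + \psi \Delta y^{p,t} \times \mathbb{1}\{\Pi_i^p > \Pi_j^p \text{ and } \Pi_i^p > W^p\} \right]$. 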


University $i$ must offer agents a wage that provides a higher expected utility 
than their reservation utility outside the university system ($\Pi_i^p > W^p$). 
Otherwise, agents always defect and the university gets a null payoff. The 
second component of the right hand side of the equation indicates that, at 
each stage $p = 2, 3$, if university $i$ provides the highest expected utility given 
the level of information of agents so far ($\Pi_i^p > \Pi_j^p$), it hires the researcher 
ranked first and captures the surplus of production as given in (13). Otherwise, 
the university hires the other researcher and cannot capture the surplus in 
revenues associated with the production premia given in (13). 

At each period, an action of university $k = i, j$ in the Universities Game is a 
vector $s_k^t = (s_k^{p,t})_{p=2,3}$ of the two wages offered at $t$. The history of the game so 
far, denoted $h^t \in H^t$, where $H^t$ is the set of all possible histories in period $t$, is 
the collection of all previous actions such that $h^t = h^{t-1} \cup (s^{t-1})$. $H$ is the set of 
all possible histories over time, that is, $H = \bigcup_{t=0}^{\infty} H^t$. A pure behavioral strategy 
at $t$ is a function from $H^t \times \mathbb{R}^2$ to $\mathbb{R}^2$, which gives a couple of wages at $t$ (an 
action $s_k^t$) for each possible history at $t$ and each possible couple of wages simultaneously 
offered by the other university. 

We are interested in Markov Perfect Equilibria (MPE, see Maskin and Tirole 
2001) of the Universities Game. So we restrict attention to Markov 
strategies, that is, strategies that are not functions of the whole history of the 
game but only of the state of the system. The vector $a^t \in \mathbb{R}^2$, which synthesizes 
the reputational premia of the two universities at $t$, is the payoff-relevant 
state of the system because no other past variable consistently 
influences present actions. We consider only strategies of the form 
$\theta_k^t : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}^2$. Such strategies give, for each action of the opponent 
and for each given pair of reputation levels, a couple of wages $s_k^t$. They are of the form 
$\theta_i^t = \theta_i^t(s_j^t, a^t)$. An MPE is a couple of strategies $\theta^t = (\theta_i^t, \theta_j^t)$ such that $\theta_i^t$ and $\theta_j^t$ 
are best responses to each other and are Markov strategies. 

The MPE notion relies much on the idea that small causes have minor 
effects. In the dynamic programming spirit, agents are assumed to simulta- 
neously maximize their continuation equations which in the context of the 
Universities Game can be set as follows: 


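(15)   $v_i(s^t, a^t) = \sum_{p=2,3} \left[ (\psi y^{p-1} - s_i^{p,t}) \times \mathbb{1}\{\Pi_i^p > W^p\} + B \times \mathbb{1}\{\Pi_i^p > \Pi_j^p \text{ and } \Pi_i^p > W^p\} \right]$. 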


This expression is similar to the payoff function (14), except that now hiring 
the first-ranked scientist at the junior and senior levels brings some delayed 
productive surplus due to the increase of reputation to the host university it 
causes. The increment of reputation depreciates over years at rate y and is 
discounted by factor 6 over an infinite horizon. The payoff surplus is thus 
By Ya 0'y'. When one adds to this term the direct spillovers 3ßy already 
present in (14), it becomes equal to B = 38y Di, 6'2' = Byl — dy) | 
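
The closed form for the discounted reputational surplus rests on a geometric series in the product of the discount factor and the reputation retention factor. The Python sketch below, with purely illustrative names and values, compares a long truncated sum with the closed form $x/(1 - \delta\gamma)$ for a generic per-period value $x$. The scaling of $x$ in terms of the model's parameters is deliberately left open here.

```python
def discounted_stream(per_period_value, delta, gamma, horizon):
    """Truncated sum of per_period_value * (delta*gamma)^t for t = 0..horizon-1."""
    return sum(per_period_value * (delta * gamma) ** t for t in range(horizon))


def closed_form(per_period_value, delta, gamma):
    """Limit of the sum above as the horizon grows; requires 0 <= delta*gamma < 1."""
    if not 0.0 <= delta * gamma < 1.0:
        raise ValueError("delta * gamma must lie in [0, 1) for the series to converge")
    return per_period_value / (1.0 - delta * gamma)


if __name__ == "__main__":
    x, delta, gamma = 1.0, 0.95, 0.8  # hypothetical values
    print(discounted_stream(x, delta, gamma, 500))
    print(closed_form(x, delta, gamma))
```

The delayed part of the surplus corresponds to the same sum started at $t = 1$, that is, the closed form minus the per-period value itself.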

The simultaneous maximization, at each period, of the two universities’ 
continuation equations leads to the MPE. The intuitions for the equilibria are 
the following. By convention, and without loss of generality, let i be the 
leading university at the period considered, t, while j is the other university, 
that is: a; > aj. University i can attract the best scientist with a lower salary at 
stages 2 and 3 because agents value being in the most reputed university and, 
if they are about to enter the junior stage of the career, they also know that 
working in this university will increase their probability to win the next con- 
test. Since both institutions value equally recruiting the best researcher, such 
asymmetry provides university i a decisive advantage. Indeed, university i can 
offer a wage such that university j cannot set a wage that might attract a first- 
ranked researcher without having a lower return than when just hiring the 
second-ranked agent. The best rate at which university j can attract the sec- 
ond-ranked agent is by setting the minimal wage which saturates agents’ 
participation constraints. Theorem 2 states this more rigorously. 


Theorem 2. Given assumptions 3, at any period $t > 0$ of the Universities 
Game, the MPE $\hat\theta^t = (\hat\theta_i^t, \hat\theta_j^t)$ is such that, if $a_i > a_j$, the equilibrium wages 
$\bar s^t = (\bar s_i, \underline s_j)$, with $\bar s_i = (\bar s^{p,t})_{p=2,3}$ and $\underline s_j = (\underline s^{p,t})_{p=2,3}$, satisfy 
i) $\Pi_j^p = W^p$ for all $p = 2, 3$, and 
ii) $\bar s_i = \arg\max \sum_{p=2,3} (\psi y^{p-1} - s^{p,t})$ subject to the incentive constraints (9) 
and (11), subject to the participation constraints and subject to competition 
constraints that ensure $i$ attracts the first-ranked agents. The competition 
constraints are given by the following condition. For any vector of wages $s'^t$ that 
differs from $\bar s^t$ only in the wages offered by $j$ (at one stage or possibly both) such 
that $\Pi_j^p > \Pi_i^p$, we have $v_j(s'^t, a^t) < v_j(\bar s^t, a^t)$ (any move allowing $j$ to hire first-ranked 
agent(s) would be detrimental to $j$). 


Proof. See appendix. 

Assumptions 3. A3.1) For all vectors of wages $s'^t$ identical to $\bar s^t$ with the 
exception that, at any given stage $p$ (possibly both), $s_j^{p,t}$ is equal to 0 instead of 
$\underline s^{p,t}$, we have $v_j(s'^t, a^t) < v_j(\bar s^t, a^t)$. A3.2) For all vectors of wages $s''^t$ identical to 
$\bar s^t$ with the exception that, at any given stage $p$ (possibly both), $s_i^{p,t}$ is equal to $\underline s^{p,t}$ 
instead of $\bar s^{p,t}$, we have $v_i(s''^t, a^t) < v_i(\bar s^t, a^t)$. 


Assumption 3 simply rules out trivial and uninteresting scenarios. It states 
that, at the equilibrium and at both stages, the university which lacks the 
reputational advantage prefers to hire the second-ranked worker rather than 
hiring no one (A3.1); and the university which has the reputational advantage 
prefers to hire the first-ranked worker at the equilibrium wage at both stages 
rather than hiring the second-ranked worker at the wage at which this agent 
just achieves her outside option utility (A3.2). 

Theorem 2 shows how the rivalry with the other university places a com- 
petition constraint on the leading university. The competition equilibrium is 
said to arise when the leading university saturates the competition constraint. 
In this scenario, the leading university sets its third stage wage $\bar s^3$ such that 


(16)   $U(\underline s^3 + B) = U(\bar s^3) + \rho$, 


that is, the first-ranked agent, at the beginning of the senior stage, would be 
equally satisfied whether hired by the less reputed university, which 
would offer the agent (in addition to the wage $\underline s^3$ that saturates the participation 
constraint) all the value of the productive (direct and delayed) premium 
this agent would bring in,¹⁵ or hired by the leading university. 

The leading university sets its second stage wage $\bar s^2$ such that 


(17)   $U(\underline s^2 + B) = U(\bar s^2) + \rho + \delta_a [2G(b) - 1](U(\bar s^3) + \rho - U(\underline s^3))$. 


At the beginning of the junior stage, the first-ranked agent would again have 
the same expected utility whether hired by the less reputed university, 
which would offer the value of the agent’s productive premium ($B$), or hired by 
the leading university and benefiting from both the direct satisfaction of 
working there ($\rho$) and the discounted surplus of expected utility due to the 
increased chance of winning the next contest. 
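
To make the two indifference conditions concrete, the Python sketch below solves them for the leading university's wages under the logarithmic utility introduced in section 5.1, with second-ranked wages set at $\exp(W^p)$. It reads condition (16) as $\ln(\underline s^3 + B) = \ln \bar s^3 + \rho$ and condition (17) analogously. The normal distribution function standing in for $G$, the function names and the parameter values are illustrative assumptions, so this is a sketch of the logic rather than a reproduction of the closed-form wages reported later.

```python
import math


def normal_cdf(x, sigma=1.0):
    """Centred normal distribution function; stands in for G (an assumption)."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def competition_wages(W2, W3, B, rho, delta_a, b, sigma=1.0):
    """Leader wages (junior, senior) that saturate the competition constraints.

    Assumes U(s) = ln(s) and second-ranked wages exp(W2), exp(W3).
    """
    s2_low, s3_low = math.exp(W2), math.exp(W3)
    # Senior stage: ln(s3_low + B) = ln(s3_high) + rho.
    s3_high = (s3_low + B) * math.exp(-rho)
    # Junior stage: ln(s2_low + B) = ln(s2_high) + rho
    #               + delta_a * (2G(b) - 1) * (ln(s3_high) + rho - ln(s3_low)).
    k = delta_a * (2.0 * normal_cdf(b, sigma) - 1.0)
    ln_s2_high = math.log(s2_low + B) - rho - k * (math.log(s3_high) + rho - math.log(s3_low))
    return math.exp(ln_s2_high), s3_high


if __name__ == "__main__":
    s2_high, s3_high = competition_wages(W2=0.0, W3=0.2, B=1.5, rho=0.3, delta_a=0.9, b=0.5)
    print(s2_high, s3_high)
```

In this sketch, a larger prestige term $\rho$ lowers both wages the leading university needs to offer, which is the asymmetry the text describes.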

Nevertheless, the incentive constraint is still effective because the uni- 
versities’ employees consistently anticipate that the wage offers are constant 


15 B is the maximal amount the non-leading university is ready to offer to the first- 
ranked agent on top of the wage that it would give to the second-ranked agent. 
in time. The leading university knows that the second period wage it offers has 
an incentive effect on the agent who is currently employed by the university at 
Ph.D. stage. Similarly, the leading university knows that the senior wage it 
offers has an incentive effect on both the Ph.D. and the junior who are 
employed in this university. Thus, there is no reason to exclude a priori a 
scenario in which the competition constraint is not saturated due to pure 
incentive purposes. The leading university wages might differ depending on 
whether the competition constraints are saturated or not. 

Let us define $(s^{p*})_{p=2,3}$ as the full incentive maximizing wages of the leading 
university $i$, which are solutions of the optimization program of theorem 2 but 
now irrespective of the competition condition. If $s^{p*} > \bar s^{p}$, $\forall p = 2, 3$, then 
the competition constraint is not effective and the full incentive maximizing 
equilibrium is said to arise (none of the competition constraints is saturated). If 
$s^{2*} < \bar s^2$ and $s^{3*} > \bar s^3$, the mixed equilibrium arises. The incentive effect is then 
prevalent only for the senior wage,¹⁶ while the junior wage offered by the 
leading university saturates the competition constraints. 


Corollary 4. Theorem 2 leads to three possible Markov Perfect Equilibria 
which differ only in the wages offered by the leading university depending on 
whether the competition constraints are saturated: i) the competition equilibrium, 
where $\bar s^{p} > s^{p*}$, $p = 2, 3$; ii) the mixed incentive equilibrium, where 
$s^{2*} < \bar s^2$ and $s^{3*} > \bar s^3$; and iii) the full incentive maximizing equilibrium, where 
$s^{p*} > \bar s^{p}$, $p = 2, 3$. 


Now we investigate the long run implications of theorem 2, which shows 
that the leading university always attracts the first-ranked scholars. There is, 
thus, a path-dependent process since the equilibrium of the university com- 
petition game preserves the competitive advantage of the leading university 
which tends to some fixed value as stated in corollary 5 below. 


Corollary 5. The MPE of the Universities Game preserves the competitive 
advantage of the leading university over time: the most reputed university hires 
the winning agents at junior and senior stages and thus conserves full advantage 
over time. In the long run ($t \to \infty$) the endogenous competition advantages the 
leading university confers to its employees tend to $b = B(1 + \gamma)/(1 - \gamma)$ for an 
infinite reputation relevant period $T$. 


Proof. According to theorem 2, the leading institution hires at both stages 
the first-ranked agents and thus preserves its advantage for ever. When t tends 


16 Senior wages have a higher incentive effect because they positively affect efforts at both the 
Ph.D. and the junior stages, while junior wages affect only agents holding a 
Ph.D. position. 
to infinity, any productive premium the non-leading university might have had 
(by any kind of accident) tends to zero (due to discounting). Then, the productive 
advantage ($\Delta b^p$) of any agent employed by the leading university is 
strictly equal to the productive premium ($b^{p,t}$). In the long run, when $t \to \infty$, it 
also becomes invariant in time and is computed as follows: 


relevant period T tends to infinity. 


5 Welfare analysis 


This section is dedicated to the welfare analysis. Optimal wages are computed 
in the first subsection under some simple specifications of the utility, disutility 
and production functions. In the second subsection, we compute the equili- 
brium wages given these specifications and study how they relate to optimal 
ones. 


5.1 Optimal wages 


We assume that the social surplus created by the academic activity is simply 
obtained, through function $\Phi$, as the discounted sum of the individual productions 
times a given parameter $\phi > 0$, which gives the per-unit social value 
of scientific knowledge and is assumed to be homogeneous (or normalized).¹⁷ 
Thus, the total surplus generated, discounted to period $t_0 = 0$, is given 
by 


(18)   $\Phi = E \sum_{t=0}^{\infty} \sum_{k=i,j} \sum_{p=1}^{3} (\phi\, y_k^{p,t}\, \delta^t)$, 


with $\delta$ the social discount factor. 

The program of the central planner is to set the optimal recruitment scheme 
at each period and stage. We assume that the central planner has exactly the 
same level of information as universities: at each period, it can only use 
ordinal information on the previous period ranking.¹⁸ The planner naturally 
offers the best positions to the first-ranked agents (as competition between universities 
does) in order to fully preserve the career incentives. It sets the optimal wages 
as follows: $S^t = (\bar s^2, \underline s^2, \bar s^3, \underline s^3) = \arg\max_{S^t} \Phi(S^t)$ under the incentive constraints 


17 Notice that we do not assume that the social value of knowledge $\phi$ and the value 
considered by the universities $\psi$ are identical. 

18 Therefore, our analysis can be seen as a second-best approach relative to an 
approach assuming an omniscient central planner. 
given in (8) and (10) and the participation constraints. By convention, wages $\bar s^p$ 
are offered to the first-ranked agents and wages $\underline s^p$ are offered to second-ranked 
ones. 

We now specify the functions $U$, $V$ and $f^p$ according to their properties given 
in section 2. The utility function is simply assumed to be $U(s) = \ln s$. The 
disutility function is assumed to be quadratic in efforts: $V(e) = \frac{1}{2} c e^2$. The 
production functions of scientific knowledge are assumed to be linear in 
efforts: $f^2(e) = \mu f^1(e) = \mu a e$. The strictly positive parameter $\mu$ gives the 
increase in agents’ productivity between the first two periods of their career. If 
we have $\mu > 1$, then agents’ productivity increases along the career path. Let 
also the $\Delta\varepsilon^p$ be identically distributed across the different periods of the 
career, that is, $g^p = g$, $\forall p$. 

We focus on the long run wages ($t \to \infty$), for which we know that 
$\Delta b^{p,t} \to b$, $p = 1, 2, 3$. Again, the long term wages are consistently expected 
to be stationary by agents. They anticipate that the next period wages will be 
the same they observe in the current period. Given such anticipations, the 
central planner has no reason to modify the long run wages in time. 

The central planner sets the lowest optimal wages at junior and senior 
stages so as to saturate agents’ participation constraints. Thus, the wages of the 
second-ranked agents at both stages are $\underline s^2 = \exp(W^2)$¹⁹ and $\underline s^3 = \exp(W^3)$. 
The wages offered to the first-ranked agents are simply derived from the two 
FOCs of the central planner program solved for $\bar s^2$ and $\bar s^3$ (detailed computations 
are in the appendix): 


3 a 
(19) 5 = 26 — dug(b) 


# = 26 Č 8,9(6)(8,(2G(b) - 1) +1) 


The optimal first-ranked junior wages decrease with the cumulative advantage 
b. The effect of b on $ is ambiguous. The optimal wages also increase with the 
productivity of agents’ efforts ($a$), the per-unit social value of scientific 
knowledge ($\phi$), and agents’ discount factor ($\delta_a$). They decrease with the 
per-unit cost of efforts ($c$). 


19 We assume here, so as to simplify the notation, that agents compare the second 
stage outside option with the current period utility (and not with the whole career 
expected utility flow). In short, agents compare $W^2$ with $U^2$ instead of $\Pi^2$. 
5.2 Comparing optimal and equilibrium wages 


Let us compute the equilibrium wages given the specifications introduced in 
the previous subsection. As stated in theorem 2, i), the lowest equilibrium 
wages saturate the participation constraints: $\underline s^2 = \exp(W^2)$, $\underline s^3 = \exp(W^3)$. 
These wages are strictly identical to their optimal counterparts. Let us now 
focus on the competition equilibrium. The equilibrium wages of the leading 
university are simply obtained by using equations (16) and (17) and intro- 
ducing the specifications: 


2 = exp{In(e™ + B) [L — 6,(2G(6) - 1)] - P-8.(2G(b) — 1) (9 -#°)} 
(21) 


(22)   $\bar s^3 = (\underline s^3 + B)\, e^{-\rho}$. 


These wages saturate the constraint that the leading university attracts the two 
first-ranked agents for any profitable offer of the opponent university. The other 
university is ready to offer a wage to a first-ranked researcher up to the totality 
of the spillovers and reputation premia he would bring in. The first-ranked 
researcher accepts the position if such a wage provides her a higher utility than 
the satisfaction to be in the most reputed university and (at the junior stage only) 
the surplus of expected utility due to the higher probability to win the next 
contest. Thus the leading university offers a wage to the first-ranked agent so 
that the wages the other university should offer to attract her are sufficiently 
high so that it prefers to hire the second-ranked agent at the best rate. 

If the leading university prefers to pay more to its employees just because 
this increases its payoffs due to the incentive effects of higher wages on its 
current employees at previous stages, we are in the full incentive maximizing 
equilibrium. It can easily be shown that the wages offered by the leading 
university in this case (set irrespectively of the competition constraints) can be 
simply derived from the optimal wages as follows: 

et Yy =2 3 
(23) 2 
Universities value scientific knowledge production at rate $\psi$ instead of $\phi$, and 
they take into account only the incentive effect of wages on their own 
employees. If $s^{p*} > \bar s^{p}$, $p = 2, 3$, and if the university’s associated returns are 
higher, then the university sets the wage offer so as to maximize the incentives 
provided to its currently employed Ph.D. agents. Notice that the equilibrium 
wages will be higher than their optimal counterparts only if $\psi > 2\phi$, that is, if 
the universities value knowledge at more than twice its social value.²⁰ 


20 For reasons of space, we do not examine the mixed equilibrium here; its analysis 
adds little. 
Proposition 6. Optimal and equilibrium wages of the second-ranked agents 
at junior and senior career stages are equally set so as to saturate agents’ 
participation constraints. The competition and full incentive equilibrium wages 
offered to the first-ranked agents can be either greater or lower than the optimal 
ones depending on the values of the parameters. 


Proof. The proof results trivially from a comparison of (21) and (22) with 
(19) and (20), respectively, and from considering (23) for the full incentive 
maximizing equilibrium. 


6 Conclusion 


In this paper, we have introduced a model of academic competition which 
intends to capture both the life-cycle effect and the cumulative advantages 
effect of the academic reputation-based reward system on individual incen- 
tives. We have suggested a mechanism according to which such an effect is 
rooted in the employment relationship. Research positions are intrinsically 
unequally productive, and the allocation of the best positions is based on a 
ranking of past scientific productivity. Unequal productivity is essentially due 
to some positive externality high-ranking agents have on the scholars 
employed by their university and to a positive effect of the accumulated 
reputation of the employing university, which is endogenously determined by 
past successes in the recruitment of the most reputed scholars. 

Our results highlight that the cumulative advantage has negative effects on 
incentives at each stage of the career but the first, at which the effect is 
ambiguous. The most important results of this paper concern the other side of 
the coin, namely, competition between universities. In equilibrium, the leading 
university always hires the first-ranked agents at junior and senior stages. The 
cumulative advantage the leading university confers to its employees is 
endogenously generated in the long run and is stationary. The wages offered 
by the non-leading university are optimal. There are three possible equili- 
brium wage offers by the leading university. In the competition equilibrium, 
the leading university sets the wages so as to saturate the competition constraint, 
which ensures that it hires the first-ranked scholars. In the full incentive 
equilibrium, the leading university sets wage offers so as to maximize the 
incentives provided to the agents it currently employs at previous stages. In 
the mixed equilibrium, only the junior wage saturates the competition con- 
straint. 

In all cases, there is no reason why the equilibrium wages offered to first- 
ranked agents should correspond to the optimal ones. In the competition 
equilibrium, the leading university cares only about the capacity of the 
opponent university to attract first-ranked agents. If competition between 
universities is very low, the full incentive equilibrium is likely to arise. Then, 
the leading university may not consider the competition constraint and rather 
focus on the provision of incentives to agents (as in the optimal scenario) but 
in a very specific and partial manner. The leading university does not take into 
account the incentives its wages have on the other university’s employees. 
Moreover, since universities can control neither efforts nor production at any 
stage, no incentive can be provided to their seniors, and junior wages impact 
only on the Ph.D. students they hire. It is only if the leading university values 
knowledge at more than twice its social value (which is rather unlikely) that the full 
incentive equilibrium wages of the leading university are higher than their 
optimal counterparts. 


Appendix 
Computation of the second stage Nash equilibrium 


The first order condition of program (8) is 


(24)   $\delta_a (\Delta U^3 + \rho) \times \partial P(y_i^2 > y_j^2)/\partial e_i^2 = \partial V(e_i^2)/\partial e_i^2$. 


Notice that the probability that $i$ wins the second contest is given by 

$P(y_i^2 > y_j^2) = P(f^2(e_i^2) + \Delta b^2 + \Delta\varepsilon^2 > f^2(e_j^2)) = 1 - G^2(f^2(e_j^2) - f^2(e_i^2) - \Delta b^2)$. 


When differentiating that expression with respect to $i$’s efforts, one obtains 


$\partial P(y_i^2 > y_j^2)/\partial e_i^2 = f^{2\prime}(e_i^2) \times g^2(f^2(e_j^2) - f^2(e_i^2) - \Delta b^2)$. 


Introducing this expression in the first order condition (24), one gets 


$f^{2\prime}(e_i^2) \times g^2(f^2(e_j^2) - f^2(e_i^2) - \Delta b^2)\, \delta_a (\Delta U^3 + \rho) = V'(e_i^2)$. 


Let us define the function $\Theta_2$ by $\Theta_2(x) = V'(x)/f^{2\prime}(x)$. This function is defined 
on $\mathbb{R}^+$, $\Theta_2 : (0, \infty) \to (0, \infty)$. Since $V'(0) = 0$, this function is null at 
0 ($\Theta_2(0) = 0$). Moreover, since $V' > 0$, $V'' > 0$, $f^{2\prime} > 0$, $f^{2\prime\prime} < 0$, it can easily be 
shown that this function is strictly increasing: $\Theta_2' > 0$. Thus, its inverse function 
$\Theta_2^{-1} : (0, \infty) \to (0, \infty)$ is also increasing. Then, one can rewrite the two first 
order conditions using these new notations: 


$\Theta_2(e_i^2) = \delta_a (\Delta U^3 + \rho)\, g^2(f^2(e_j^2) - f^2(e_i^2) - \Delta b^2)$, 
$\Theta_2(e_j^2) = \delta_a (\Delta U^3 + \rho)\, g^2(f^2(e_i^2) - f^2(e_j^2) - \Delta b^2)$. 


Given the assumptions formulated so far, these two equations are of the form 
$e_i^2 = h(e_j^2)$ and $e_j^2 = h(e_i^2)$ with $h$ a strictly increasing and continuous function 
on $\mathbb{R}^+$. Therefore, if an equilibrium exists, it is necessarily symmetric, of the 
form $e_i^2 = e_j^2 = e$. This equilibrium then satisfies the following expression: 


$\Theta_2(e) = \delta_a (\Delta U^3 + \rho)\, g^2(\Delta b^2)$. 


Since $\Theta_2$ is strictly positive, null at 0, and strictly increasing, this equation 
admits a unique solution. Moreover, since $\rho > 0$, $\Delta U^3 > 0$ and $g^2 > 0$, this 
solution is strictly positive. The unique symmetric second stage Nash equilibrium 
is thus given by 


$\tilde e^2 = \Theta_2^{-1}(\delta_a (\Delta U^3 + \rho)\, g^2(\Delta b^2))$. 


Computation of the first stage subgame perfect Nash equilibrium 


Let $p_i$ denote the probability that, if the agent employed by university $i$ has 
won the first stage contest, she will also win the second stage contest. Since the 
second contest is influenced by the results of the first, $p_i$ is a conditional 
probability. Since the second stage equilibrium efforts (9) are identical, that 
conditional probability is independent of agents’ identity ($p = p_i = p_j$). It can 
be computed as follows: 


$p = P(f^2(\tilde e^2) + \varepsilon_i^2 + \Delta b^2 > f^2(\tilde e^2) + \varepsilon_j^2) = 1 - P(\Delta\varepsilon^2 < -\Delta b^2) = 1 - G^2(-\Delta b^2)$. 


Referring to the assumption that $g^2$ is symmetric around 0, one can write: 
$p = G^2(\Delta b^2)$. 
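
The symmetry argument can be checked by simulation. The Python sketch below draws the two agents' shocks as independent centred normals (an illustrative assumption of ours; the argument only requires a symmetric density for the difference in shocks) and compares the simulated frequency with which the advantaged agent wins against the distribution function of the shock difference evaluated at the advantage. Equal equilibrium efforts cancel out and are therefore omitted.

```python
import math
import random


def simulate_win_probability(db2, n_draws=200_000, sigma=1.0, seed=0):
    """Share of draws in which eps_i + db2 > eps_j, with equal efforts cancelled."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_draws):
        eps_i = rng.gauss(0.0, sigma)
        eps_j = rng.gauss(0.0, sigma)
        if eps_i + db2 > eps_j:
            wins += 1
    return wins / n_draws


def analytic_win_probability(db2, sigma=1.0):
    """G(db2), where G is the CDF of eps_i - eps_j ~ N(0, 2*sigma^2)."""
    sd_diff = sigma * math.sqrt(2.0)
    return 0.5 * (1.0 + math.erf(db2 / (sd_diff * math.sqrt(2.0))))


if __name__ == "__main__":
    for db2 in (0.0, 0.5, 1.0):
        print(db2, simulate_win_probability(db2), analytic_win_probability(db2))
```

With no advantage the probability is one half, and it rises above one half as soon as the advantage is positive, which is what makes $2G^2(\Delta b^2) - 1$ positive.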

We denote by $\Delta\Pi^p$ the surplus of expected utility received from stage $p$, 
inclusively, over the remaining career cycle that results from winning the 
contest at stage $p$; formally, 


$\Delta\Pi^p = \Pi^p_{i \mid y_i^p > y_j^p} - \Pi^p_{i \mid y_i^p < y_j^p}$, 


where $\Pi^p_{i \mid y_i^p > y_j^p}$ is the expected utility of agent $i$ at stage $p$ conditionally on $i$ 
winning period $p$’s contest. 
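Using these notations and definitions, we can rewrite $\Delta\Pi^1$ as follows: 


$\Delta\Pi^1 = \delta_a (\bar U^2 + \rho) + \delta_a^2 [p(\bar U^3 + \rho) + (1 - p)\underline U^3] - \delta_a \underline U^2 - \delta_a^2 [(1 - p)(\bar U^3 + \rho) + p\, \underline U^3]$. 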


After a few simplifications, we get 


= 6, (AU? +9) + 62. (2G? (Ab?) — 1) (AU? +9), 


with $\Delta U^2$ the difference in utility between having won the first contest and
having lost it. Introducing that expression in the first order condition of the
first period maximization program (10), we get:²¹


\[
\Theta_1(e^1) = g^1(\Delta b^1)\,\delta_a\big[\Delta U^2+\varphi+\delta_a\big(2G^2(\Delta b^2)-1\big)(\Delta U^3+\varphi)\big],
\]


with $\Theta_1$ defined analogously to $\Theta_2$, that is, $\Theta_1(x) = V'(x)/f^{1\prime}(x)$. The equilibrium
is symmetric and unique for the same reason given in case of the second
period. The final expression of the equilibrium efforts (11) follows.
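
The "few simplifications" leading to the surplus from winning the first contest can be spelled out step by step. This is a sketch under assumed notation: $\overline{U}^p$ and $\underline{U}^p$ stand for the stage-p utilities of the contest winner and loser, so that $\Delta U^p = \overline{U}^p - \underline{U}^p$; $\delta_a$ denotes the discount factor; and $p = G^2(\Delta b^2)$ is the probability that the first stage winner also wins the second contest.

```latex
\begin{align*}
\Delta\Pi^1
 &= \delta_a\big(\overline{U}^2+\varphi-\underline{U}^2\big)
  + \delta_a^2\big[p(\overline{U}^3+\varphi)+(1-p)\underline{U}^3
  -(1-p)(\overline{U}^3+\varphi)-p\,\underline{U}^3\big] \\
 &= \delta_a(\Delta U^2+\varphi)
  + \delta_a^2\,(2p-1)\big(\overline{U}^3+\varphi-\underline{U}^3\big) \\
 &= \delta_a(\Delta U^2+\varphi)
  + \delta_a^2\,\big(2G^2(\Delta b^2)-1\big)(\Delta U^3+\varphi).
\end{align*}
```

The factor $2p-1$ is the difference between the winner's and the loser's chances of winning the second contest, which is why a larger second stage bias raises the value of winning the first one.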


The incentive properties of the cumulative advantage 


Here, we study the effects of the competitive advantages at the first two stages
of the career ($\Delta b^1$ and $\Delta b^2$) on the equilibrium efforts ($\hat{e}^1$ and $\hat{e}^2$).

The second period efforts are independent of the first period advantage. In
order to characterize how the cumulative advantage $\Delta b^2$ affects the second
period equilibrium efforts, we differentiate both sides of (11) with respect to
$\Delta b^2$:


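\[
\partial\hat{e}^2/\partial\Delta b^2 = g^{2\prime}(\Delta b^2)\,\delta_a(\Delta U^3+\varphi) \times \Theta_2^{-1\prime}\big(g^2(\Delta b^2)\,\delta_a(\Delta U^3+\varphi)\big) \quad (<0).
\]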


We know that $\Theta_2^{-1}$ is an increasing function. Moreover, since $\Delta b^2 > 0$, and
since $g^2$ has its unique extremum at 0, $g^{2\prime}(\Delta b^2)$ is strictly negative. Thus, we
can conclude that $\partial\hat{e}^2/\partial\Delta b^2 < 0$.

The first period equilibrium efforts are functions of both the first and second
stages cumulative advantages. From (9), we obviously have $\partial\hat{e}^1/\partial\Delta b^1 < 0$,
since $\Theta_1^{-1}$ is increasing, $\Delta b^1 > 0$, $g^1(x)$ decreases for all $x > 0$, and
$2G^2(\Delta b^2) - 1 > 0$ ($g^1$ is symmetric around its unique extremum at 0). As regards
the effect of the second stage advantage on the first stage efforts, we differentiate
both sides of (9) with respect to $\Delta b^2$. We get


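\[
\partial\hat{e}^1/\partial\Delta b^2 = 2 g^1(\Delta b^1)\, g^2(\Delta b^2)\,\delta_a^2(\Delta U^3+\varphi)
\times \Theta_1^{-1\prime}\big(g^1(\Delta b^1)\,\delta_a\big[\Delta U^2+\varphi+\delta_a\big(2G^2(\Delta b^2)-1\big)(\Delta U^3+\varphi)\big]\big) \quad (>0).
\]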


The second period bias has a disincentive effect on the second period efforts 
while it increases first period efforts. 
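
These sign results can be illustrated numerically. The sketch below is not the model's calibration: it assumes quadratic effort costs and linear production at both stages (so that Θ₁ and Θ₂ are both e ↦ ce/α), centred normal noise with the same `sigma` at both stages, and hypothetical parameter names `prize2` and `prize3` standing for ΔU² + φ and ΔU³ + φ.

```python
import math


def normal_pdf(x: float, sigma: float) -> float:
    """Centred normal density (assumed shape of g^1 and g^2)."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x: float, sigma: float) -> float:
    """Centred normal distribution function (assumed shape of G^2)."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def second_stage_effort(delta_b2, delta, prize3, c=1.0, alpha=1.0, sigma=1.0):
    """Theta_2^{-1}(delta * prize3 * g2(delta_b2)) with Theta_2(e) = c*e/alpha."""
    return alpha * delta * prize3 * normal_pdf(delta_b2, sigma) / c


def first_stage_effort(delta_b1, delta_b2, delta, prize2, prize3,
                       c=1.0, alpha=1.0, sigma=1.0):
    """Theta_1^{-1}(g1(b1) * delta * [prize2 + delta*(2*G2(b2) - 1)*prize3])."""
    surplus = prize2 + delta * (2.0 * normal_cdf(delta_b2, sigma) - 1.0) * prize3
    return alpha * normal_pdf(delta_b1, sigma) * delta * surplus / c


if __name__ == "__main__":
    base = dict(delta=0.9, prize2=1.5, prize3=2.0)
    for b2 in (0.25, 0.5, 1.0):
        e1 = first_stage_effort(0.5, b2, **base)
        e2 = second_stage_effort(b2, delta=0.9, prize3=2.0)
        print(f"second stage bias={b2:.2f}  e1={e1:.4f}  e2={e2:.4f}")
```

Raising the second stage bias lowers the second stage effort and raises the first stage effort in this example, while raising the first stage bias lowers the first stage effort.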


21 The symmetry of the density function around 0 preserves the symmetry of the
equilibrium since $g^p(\Delta b^p) = g^p(-\Delta b^p)$, p = 1, 2.

Proof of theorem 2


The proof has two parts. 

a) It is an MPE. Let us show that university j has no incentive to deviate. If
j offers a higher wage at any stage, two cases may arise. If this increase is not
sufficient to attract the first-ranked agent, then j has higher costs but returns
remain unchanged. If the increase is sufficiently high, university j attracts the
first-ranked agent but, according to ii), j’s expected payoffs are lower. If j
decreases any wage offer, then the second-ranked agent has an incentive to
deviate and leave the university system. The university does not fill the
position and gets a lower return according to A3.1. Thus, j has no incentive to
move. Let us show now that university i also has no incentive to deviate. If i
increases any wage, revenues remain the same while costs increase. If i
decreases wages by any increment, then j reacts by setting a wage which allows
it to hire the first-ranked agent and increase its returns. This would of course
sharply decrease i’s payoffs. Thus, i has no incentive to move either.

b) No other MPE exists. Excluding the above mentioned MPE,
there are four possible situations for any given stage p (which can be treated
independently) that can be categorized by comparing the wages offered by i
and j to those offered in that equilibrium. 1) Both universities offer higher wages at stage p:
If j’s offer is not sufficient to attract the first-ranked agent, then it has an
incentive to reduce its offer. If it is sufficient to attract the first-ranked agent,
then j has an incentive to reduce its offer because its payoffs are then lower
than the payoffs from just setting its offer to $\underline{s}^{p,t}$ and hiring the second-ranked
agent. This would already be true if i were setting its wage offer at $\overline{s}^{p,t}$ (condition
ii). Now that i makes an even higher offer, it is clearly also true, and j
has an incentive to deviate. 2) University j offers a higher wage at stage p and i
a lower wage: If j’s offer is not sufficient to attract the best scientist, j clearly
has an incentive to deviate. If it is sufficient, then i has an incentive to increase
its offer so that it will reach the threshold given in ii), that is, where j will
prefer reducing its offer and aim at hiring the second-ranked agent. 3) Both
universities offer lower wages at p: Then j certainly cannot hire any agent and
gets lower payoffs according to A3.2. University j deviates. 4) University j
offers a lower and i a higher wage at p: Then j certainly cannot hire any agent
and gets lower payoffs according to A3.2 and i pays a higher salary without
compensation. University j deviates. In all situations which differ from the
MPE, at least one university deviates. Hence, there is no other equilibrium.

Computation of the optimal wages


The program of the central planner can be rewritten as follows: 


max ® = = O° (26 [f (E) + P (ei) +P) + 38] 


S!vt>0 
+3pAb?* _ [2s + get + get + end 4 $7]) 


Given the specifications introduced in section 4, we have ©,(e) =—e and 
2 a 


c . 2 a 
O,(e) = —e. Moreover, one obtains f'o © (x) = oe and 


ua 2 
oO, We wa, Then ® becomes 


C 


= (2 = ö,9(AbP*) (Ins — Ins?" + o + ,[2G(Ab"*) - 1] 


2,72 
(In 5% — Ins" + 9)). + d,g( Ab") (in S — Ins + gp) + 3e 


+3pAbrr = [2s!* + get + 527 + go + =) . 


In the long run, when $t \to \infty$, the cumulative advantage becomes stationary,
$\Delta b^{p,t} = b$, as shown in corollary 5. Then, the optimal wages also become
stationary: $\overline{s}^{p,t} = \overline{s}^{p}$ and $\underline{s}^{p,t} = \underline{s}^{p}$, p = 2, 3. The first order conditions of the
central planner’s program, solved with respect to the lowest wages, lead to
negative values. Then, the central planner saturates the participation constraint
of the second-ranked agents at the two stages considered: $\underline{s}^2 = \exp(W^2)$
and $\underline{s}^3 = \exp(W^3)$. After some computations and simplifications, the first
order conditions computed with respect to the highest wages lead to the
equilibrium wages given in (19) and (20).

Comment by


DOMINIQUE DEMOUGIN 


Economists and organizational theorists have long overlooked the study of 
their own institution, the university. What precisely do universities produce? 
What is their objective function? Why are they structured the way they are?
What is the best way to organize them? Align incentives? The enormous 
increase in higher education together with budgetary restrictions, increased 
systems competition, the importance of human capital, in particular, in 
research and development, brain drain, and more, all of these things should 
induce organizational researchers to analyze the institutions of higher edu- 
cation. Nicolas Carayol’s paper is a welcome step in that direction. 

Carayol aims at explaining two phenomena observed in the university 
systems of most countries and referred to in his paper as the life cycle and the 
cumulative advantage effects. The first effect simply recognizes that for most 
professors, the early phase of their career is usually also the more productive 
in terms of research. This observation seems to be true independently of the 
country examined despite very different university systems. The second effect 
refers to the impact of reputation which seems to afford a durable advantage 
to institutions endowed with it. The paper shows that both effects can be 
explained as resulting from an optimal mechanism problem in the face of 
moral hazard difficulties. 

Intuitively, suppose universities benefit from the research output of their 
researchers and researchers benefit from the reputation of their respective 
institution. At the end of their doctorate, more reputed universities will be 
able to attract the best young professors. In addition, due to the moral hazard 
problem on the side of professors, universities will offer them an incentive 
scheme. Here the paper imposes a very strong restriction assuming that uni- 
versities cannot offer incentive schemes based on output. Rather each period 
the universities make competitive wage offers to their academic staff. Given 
the competition between the universities, researchers are faced with a form of 
tournament. However, given that the universities do not have the same rep- 
utation the tournament is biased, favoring last period’s winner. 

The setup provides a straightforward explanation for the inverse U-shaped 
productivity. Naturally, in the three-period framework analyzed by the
model, the winner in the doctoral phase goes to the better institution. 
Therefore, due to the positive externality of reputation, his next period’s 
productivity increases. However, in the third period the researcher does not 
have any incentive to work since the wage scheme only rewards last period’s 
output. Altogether, it guarantees the inverse U-shaped result. More generally,
the reputation effect would allow us to generalize the conclusion to a model 
where academics live longer than three periods. Intuitively, the effect would 
obtain because reputation is important to the academic in the early phase of 
his career, but evidently becomes less important over time. The cumulative 
advantage effect is also very intuitive; better institutions can more easily 
attract promising academics. They in turn have higher productivity (also due 
to the reputation effect). On average and over time, the mechanism only 
reinforces the reputation of the institution.

The paper suggests a few interesting conclusions. For the remaining dis- 
cussion, we will need to keep in mind that universities produce more than just 
research, e.g., teaching and administrative tasks. These other tasks are not all 
as difficult to measure. For example under the old German university system, 
professors received a bonus related to the number of students participating in 
their class. In such a context, the foregoing analysis suggests that universities 
should provide incentives for older colleagues to shift the emphasis of their 
work away from research and more towards those more easily measurable 
activities. For example, in the current German system faculties often nominate 
relatively young professors to become chairman while some of the older 
colleagues are free from administrative work. Given the inverse U-shaped
effect described above, this is most certainly a waste of human capital. From
the point of view of the German university system and according to Carayol’s 
model, this should have two negative effects. First, it reduces the accumulated 
reputation since some of the human capital is wasted. Moreover, it suggests to 
the better German academics that they would be better off starting outside of
the German system, for example by spending the early part of their career in 
the US. Again from the point of view of the overall system, this effect is 
negative, reducing their current reputation and through the inter-temporal
feedback also future reputation. 

In addition, the paper makes clear that universities may have an advantage 
providing tournament contracts, instead of solely creating incentives by the 
use of outside offers. As discussed earlier, the outside offer scheme distorts 
incentives, particularly in the latter part of the career.¹ A tournament scheme
between academics of single institutions would require for many countries a 
significant departure from current practice. Since output is not easily verifi- 
able from the outside (e.g., how would an economist be able to judge the 
research output of colleagues in a medical faculty and vice versa), it would
require either a complicated mechanism with outside referees or delegating
decisions to a “powerful” chairman. The latter mechanism only functions as


1 It is likely that distortions are not only found in the latter part of the career. Without
solving for the first best solution, I would presume that in the early phase researchers 
may, under some conditions, get involved in a rat race contest. 
part of a reputational equilibrium. In that respect, it is interesting to note that
North American universities (i.e., Canadian included) have been doing this 
for quite a while. In practice, it seems to require both a powerful chairman and 
outside referees. 

The paper also emphasizes that an intelligent emeritus policy may be 
advantageous. It would allow universities to increase research incentives even 
in the last years of a colleague’s career. For example, granting privileged 
access to research facilities and libraries, office space, secretarial work etc. are 
all important ways to reward colleagues for their commitment to the last 
period. 

To conclude, I would like to discuss a few critical aspects of Carayol’s paper, 
thereby suggesting future avenues of research. First and foremost, the paper 
introduces a very ad hoc objective function for universities and for professors. 
Notwithstanding the importance of research, universities are also focused on
teaching and require a management structure. Due to particularities of higher 
education, universities are usually managed by academics themselves. What 
would be important is to create a link between “teaching and research” output 
and revenues, or, alternatively, directly with society’s welfare.² Regarding the
description of researchers, an extension including multiple tasks would appear 
essential and promising. Finally, considering alternative hypotheses may also 
be useful. Suppose, for example, that the incentive problem for research can 
be easily solved, either because research output is more easily verified than 
often assumed or because of intrinsic motivation. Moreover, suppose that 
academics decide on their specialization during their doctoral phase and that 
changing specialization later on is too costly. Finally, assume that the “hot” 
topics follow some random drift and that better universities have better
information on the drift.³ First, this would explain why doctoral candidates
want to go to top universities and why graduates from top institutions do 
better on the job market. It would also explain the inverse U-shaped
research output. Intuitively, over time “hot” topics shift away thereby 
reducing the output of researchers. For example, the benefits of having done a 
Ph.D. with Debreu vanished in the early 80s. Last but not least, it also explains 
why the reputational advantages are long lasting.


2 To emphasize the importance of this point, consider the case of French faculties in
economics. Clearly outside of France most would agree that Toulouse is currently the 
best French research faculty. Nevertheless, recent decisions by the French ministry of 
education seem to favor institutions in Paris for the development of a top Ph.D. program.

3 For example, because the editorial boards of the top journals are selected from 
renowned universities. These assumptions would also explain why universities often 
subsidize journals by allowing their researchers to take editorial positions and often even
rewarding them for doing so. 


Research Networks — Origins and Consequences: 
Nanotechnology and Micro-economics in Germany

1 Introduction


There is a fast growing discourse on the potential benefits of networks for the 
research process in sociology, business administration and economics. 
Important concepts are the idea of the emergence of a new mode of knowl- 
edge production, the concept of social capital and its role for the production of 
knowledge or the idea of creating critical mass in research. These ideas have 
found their way into science policy recommendations and programs such as 
networks of excellence or the promotion of interdisciplinarity and of uni- 
versity-industry-cooperation by research funding organizations in Germany 
and abroad. 

This paper presents competing hypotheses on why networks might have a
positive impact on research performance. Preliminary evidence from a 
quantitative and qualitative study of three subfields, astrophysics, nano- 
technology and micro-economics, is presented. These fields are characterized 
by input and output indicators with a special focus on the structure of net- 
works and on strategic network behavior. Next to a bivariate analysis, four 
preliminary models relating input, networks and output are presented. The 
data were collected in a research project which is part of a larger research 
group dealing with the changing governance of the German research system. 


2 The role of networks in scientific production 
2.1 Networks and a new mode of knowledge production 


The information society, the knowledge society and the network society are 
metaphors trying to catch important characteristics and changes in modern 
societies. One of these metaphors is the idea of a new mode of knowledge 


* The members of the project team are Dorothea Jansen (principal investigator and 
speaker of the research group), Andreas Wald and Karola Franke at the German 
Institute for Public Administration. Information on the project and the larger research 
group is available at www.foev-speyer.de/governance/. Funding by the German Research 
Association is acknowledged (DFG FOR 517: Ja 548/4-1, Ja 548/5-1, Ja 548/5-2). 
production (Gibbons et al. 1995, Nowotny et al. 2001). Scholars from the
sociology of science postulate that scientific knowledge today is no longer the 
domain of scientific disciplines and academic actors. Instead it arises from 
distributed production connecting producers and users from different societal 
subsystems. Transdisciplinarity and an orientation toward application, col- 
laboration and networking between these actors are supposed to become vital 
for the production of knowledge and for its legitimacy. The proponents of 
these contested theses (for a critique cf. Weingart 1997, Krücken 2001,
Slaughter and Rhoades 2004, Trute 2005) expect that users instead of aca- 
demic disciplines will have a larger say in the definition and evaluation of 
research programs. The trend for this new mode of knowledge production is 
supposed to be particularly strong within fields such as biotechnology or 
nanotechnology while traditional fields such as astrophysics are seen to hold 
on to disciplinary lines. 

Although there is still little systematic empirical evidence of a positive 
effect of a mode-2 type of research and collaboration profile on academic 
performance,¹ the mode-2 ideas soon became topics in the political debates on
reforming the German research system. Shortcomings in quality and quantity 
of research output, in competitiveness and innovativeness of the system were 
attributed to a deficit in collaboration and networking between disciplines, 
between different types of research organizations, between academic and 
industrial actors and between basic and applied research. More collaboration 
and heterogeneous collaboration and networking are asked for by more and 
more funding agencies and programs (DFG 2003, Wissenschaftsrat 2003). I 
will try to give a tentative answer to the question of whether enforcement of 
heterogeneous networks or of an applied research strategy actually does 
enhance scientific productivity. In the next paragraph, I argue that networks 
can be an asset and deliver social capital, but can be a social liability as well. 


2.2 Social capital and social liability from research networks 


The main producers of new knowledge today are not individual researchers or 
entrepreneurial inventors but research groups collaborating within and across 
organizations. These groups are embedded in different types of organizations 
(academic, government, industry), disciplines and industry sectors. New 
knowledge and especially basic innovation and new paradigms emerge mostly 
at the margins of disciplines, organizations and sectors (cf. Hippel 1988,
Blackler et al. 1998, Nahapiet and Ghoshal 1998). It is produced by combination
and exchange. This is why embeddedness in research networks via


1 Negative effects found by Evans (2004); no effects found by Gulbrandson and
Smeby (2005) and Heinze (2004). 
research contacts, the flow of information, knowledge pieces, materials,
instrumentation and people can be treated as a kind of social capital. 

I define those aspects of a network structure² that open or constrain
opportunities of action for individual or corporate actors as social capital. 
Social capital can be converted into other forms of capital. In the social 
network literature, five types of benefits from social capital are distinguished 
(Jansen 2002, Lin et al. 2001): information and (tacit) knowledge, trust in
and enforcement of norms, structural autonomy, entrepreneurial profits from 
arbitrage, and finally social influence coming from legitimacy and reputation 
attributed by other relevant actors. The benefits accrue to individual and 
corporate actors, or to groups of actors within a social structure. 


Figure 1 The bases of social capital: 
weak ties vs. strong ties, structural holes vs. dense clusters³


The different benefits from social capital are supposed to be based on 
different types of ties and structural configurations. Structures or positions, 
which are beneficial in one regard can be detrimental for some other goal. The 
main structural differentiation is between so called strong and trusted ties 
(solid lines) in densely knit networks and so called weak ties (dotted lines) in 
sparse extended networks. A sparse network yields information and structural 


2 Networks in a methodological sense consist of a set of nodes (actors, events, ideas)
and the edges/relations that are defined on them (e.g. information flow, influence,
membership). This concept is related to but not identical to the governance focused use 
of the term in transaction cost economics (Williamson 1991) or in neoinstitutional 
sociological approaches (Powell 1990, Podolny 2001, Jansen 2002). 

3 Adapted from Burt (1992, 27). 
autonomy for a broker. Brokers can bridge “structural holes” (e.g. between
EGO and the three dense clusters) and thereby combine diverse information/ 
knowledge, transfer knowledge or extract arbitrages from otherwise uncon- 
nected ties. While Burt (1992) in his theory of structural holes claims that 
dense networks mean constraints and inefficiency for an actor, other scholars 
underline the positive effects of dense networks with easy going collaboration 
and knowledge flow, high trust and low transaction costs (Coleman 1990, 
Powell 1990). 
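
To make the contrast between sparse and dense ego-networks concrete, the following minimal sketch computes two simple structural indicators for a focal actor: the density of ties among its contacts and an effective-size measure of non-redundant contacts. It assumes binary, undirected ties and uses the common simplification of Burt's effective size for that case (number of alters minus the average number of ties each alter has to other alters). The function name, the node labels and the example ties are hypothetical and are not taken from the study's data.

```python
from itertools import combinations


def ego_network_measures(ego, ties):
    """Return (size, density, effective_size) of ego's network.

    `ties` is an iterable of undirected pairs (a, b). Density is the share of
    realised ties among ego's alters. Effective size is n - 2t/n, where n is
    the number of alters and t the number of ties among them (binary,
    undirected case).
    """
    edges = {frozenset(pair) for pair in ties if pair[0] != pair[1]}
    alters = {node for edge in edges if ego in edge for node in edge if node != ego}
    n = len(alters)
    if n == 0:
        return 0, 0.0, 0.0
    alter_ties = sum(1 for a, b in combinations(sorted(alters), 2)
                     if frozenset((a, b)) in edges)
    possible = n * (n - 1) / 2
    density = alter_ties / possible if possible else 0.0
    effective_size = n - 2.0 * alter_ties / n
    return n, density, effective_size


if __name__ == "__main__":
    # A broker-like position: four contacts, only two of which know each other.
    ties = [("EGO", "A"), ("EGO", "B"), ("EGO", "C"), ("EGO", "D"), ("A", "B")]
    size, density, eff = ego_network_measures("EGO", ties)
    print(f"size={size}  density={density:.3f}  effective size={eff:.2f}")
```

A low density together with an effective size close to the number of contacts corresponds to the broker position spanning structural holes; a density near one corresponds to the dense, trusted cluster.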

Empirical studies show strong tendencies of interorganizational and personal 
networks towards homophily and stability. In-depth studies of collaboration 
patterns (Uzzi 1997) as well as longitudinal quantitative studies of alliance 
formation (Gulati and Gargiulo 1999, Todeva and Knoke 2002) report that 
network formation is guided by previous experience with partners or partners 
of partners. Strong and embedded ties tend to go together with high returns to 
an actor in the form of stability, profitability, successful innovations, access to 
tacit knowledge and to finance (Uzzi 1997, Talmud and Mesch 1997, Ingram 
and Roberts 2000, Hansen 1999, Ahuja 2000, for a review see Jansen 2002). But 
there is also evidence that an overdose of embeddedness into networks can 
hamper innovation and produce too much confidence in established routines
and products (Burt 1999, Kern 1998, Henderson and Clark 1990). 

Studies of academic research networks at the micro level are mostly based 
on bibliometric data, i.e. copublication analysis. They confirm a positive effect 
of network embeddedness, particularly of top level and international ties on 
scientific output and impact (Frenken et al. 2005, Adams et al. 2005). Structural
information (clustering, density, brokerage positions) is seldom
reported in bibliometric analyses.

Thus the central question is which type of tie and which type of network is 
more successful in knowledge production in the long run. Will trust breeding 
cliquish networks bring about stability at the cost of innovation and learning 
capacities? What is the effect of brokerage between cliques? Since network 
structures and ties that work in the exchange of codified and public knowledge 
may not work in the transfer and creation of implicit or proprietary knowl- 
edge, the ultimate question will probably be how to balance both types of ties 
and stability and openness of networks. 


2.3 Learning and network strategies as governance mechanisms in networks 


There are two important cognitive variables which can explain the effect that 
networks will have in the long run. The capacity for the production of new 
knowledge depends on the absorptive capacity (Cohen and Levinthal 1990) of 
a group. Only those working close to new knowledge can grasp its relevance. 
This means that a wise learning strategy must invest in the monitoring of
research areas other than the current ones. Whether a research group is able
to do this depends on its size and organizational slack. 

Next, a wise network strategy should try to avoid the ossification of net- 
works. Open and strategic choice of complementary skills and synergy of 
research capacities should be an important argument in network formation. 
But such a strategy rests on the provision of a functional equivalent for per- 
sonal trust as the heretofore prominent governance mechanism in networks. 
Systems trust in more abstract role structures and positions must be built up
endogenously by the network actors. 

Centrality and prestige of actors might be able to substitute for personal 
experience (cf. Powell et al. 1999, Stuart 1998, Podolny et al. 1996). Actors in 
central positions tend to attract collaboration offers without such previous 
experience. In particular, actors in the center of role structures succeed in 
combining high centrality and prestige with a broker position that attracts new 
ties (cf. Jansen 2000 and 2004, Darr and Talmud 2003, Obstfeld 2005). They
are the ones that connect heterogeneous partners. But their position does not,
as Burt would have it, support unconstrained brokerage and arbitrage. On
the contrary, they are caught between two or more groups or cliques
(Krackhardt 1999). They have to integrate divergent demands, research cultures,
and disciplinary views. At the same time they are highly visible. Reputation
effects are strong for them and they have a lot to lose. Transitive role
structures and actors in between several cliques might very well work as 
governance mechanisms that support trust in an open social structure with 
changing actors. Those with high influence and prestige act as trustees who 
can prevent opportunistic behavior in networks by informal more or less 
horizontal control and sanctioning (cf. Wittek 1999, Lazega 2000). 

This idea of stable transitivity and changing actors within an open social 
structure can only work if actors build their network not only on past expe- 
rience but on a forward looking calculation of potential trustworthiness 
(Buskens and Raub 2002). Since ossification of networks could thus be
avoided, the strategic attitude towards networks should have a positive
influence on the absorptive capacity for new ideas and on research performance.


2.4 The role of funding organizations and research policy 


The political quest for building networks comes along with another reform 
issue: the idea of strategic concentration of funds on selected research pro- 
grams. At the meso-level, organizations are advised to concentrate on core 
competencies and sharpen their profiles. At the micro level research groups 
are advised to assemble critical mass by becoming part of a larger research 
network. However, it is not at all clear how an increase in networking, concentration
of internal resources and concentration of external funding will
influence innovativeness and competitiveness of the research system. 

One of these problems is the choice between generalist and specialist 
strategies by the research groups. What will research groups do when they are 
confronted with a highly volatile research area and concentrated resources? 
Focused funding programs establish a monopolistic demand structure. From 
the point of view of organizational ecology, they constitute a coarse grained 
environmental niche (Hannan and Freeman 1977). Science and engineering 
are characterized by a so-called concave fitness structure, i.e. large differences 
between the demands of different research lines and methods. Concave fitness 
structures ideally should lead to high profits from specialization. In interaction 
with a coarse-grained environment and high volatility, population ecology 
instead predicts a more generalist strategy as a hedge against long periods of 
low fit to the demand structure. De-differentiation and a loss of returns on 
investments in specialization could be the consequence. Generalist strategies 
would also lower the need for external collaboration and networking. 

Another question is how the strategy of a research group depends on its size
and on resources of the larger organization (Wernerfelt 1984). It might very 
well be that under conditions of resource concentration only large and 
established research organizations are able to profit from specialization by 
internal differentiation and the management of resources and portfolios. 
Networks ideally allow for the bundling of resources. Thus, networks might be 
able to solve the critical mass problem of the smaller research groups. Open 
networks might be able to profit from their heterogeneity and innovativeness. 
On the other hand networks combined with the concentration of resources on 
large programs might lead to lock-in effects. They might undermine the 
emergence of new research lines, which happen to fall outside of focused 
programs and profiles. Enforcing network building and critical mass might 
come to the disadvantage of smaller university groups, who cannot build on 
the support of large and established research institutions. 


2.5 Competing hypotheses on determinants of research productivity and the
role of networks 


According to all approaches discussed above the size of networks is expected 
to enhance research performance (H1). Disciplinary heterogeneity of research 
groups (H2), an applied research orientation (H3), and industry collaboration 
(H4) increase research performance in the perspective of the mode-2 theory. 
The structural holes approach in social network analysis posits that hetero- 
geneity of networks and low social control/ constraints in networks have a 
positive impact on performance. Thus it is expected that a large number of
industry ties (H4) as well as of international ties (H5) enhance performance.
Low network density in ego-networks (H6) is an indicator of efficient networks
and low control/constraints. Sparse networks therefore could support
high performance. On the other hand, the competing view on the most
important asset from networks holds that dense networks will support fruitful 
scientific exchange and knowledge transfer (H6 reversed). From an actor- 
centered learning perspective on networks, the strategic relevance of net- 
works, an open choice of partners and the intake of new ties into networks are 
seen as indicators for a governance mechanism that might avoid the pitfalls of 
closed ossified networks. Open networks are expected to have a positive 
impact on performance (H7). Next, arguments from organizational ecology 
and management posit that specialization and differentiation of research 
profiles lead to higher performance (H8) as well as to larger networks (H9). 
Organizational ecology draws attention to a possible negative effect of a 
coarse grained environment — here a concentration of funding in focused 
programs — on specialization and performance (H10). Finally, from a resource 
based view on organizations we expect that groups from large established 
research institutions such as the Max Planck Society or the National Research 
Centers are in a better position to attract large networks (H11) and to invest 
in specialization (H12). This in turn yields a higher performance (H13). 


3 Networks and scientific performance — preliminary evidence 
3.1 Sampling procedure and data collection 


Three disciplinary subfields were chosen for this study. One of them is a
typical mode-2 field (nanotechnology), while another one is a typical mode-1
field (astrophysics). To represent the social science disciplines,
microeconomics was chosen. The population includes all German institutions
which published at least one article in the selected fields according to the
Science Citation Index (SCI) or ECONLIT, respectively, in 2002 or 2003. Fields
were technically described by experts from the central data project. The
relevant research groups affiliated to the institutions listed were identified
with the help of directories and other information available on the web. The
research group is defined here as the smallest unit in an organization which
conducts a longer-term research program.

The web search resulted in 122 astrophysics groups, 225 nanotechnology
groups, and 56 microeconomics groups. After a validation with the help of
experts from academia and funding institutions, samples of size n=25 were
drawn for each field. Expert interviews with the leaders of these research
groups were conducted in 2004 and 2005. The interview consisted of a
semi-structured qualitative part and a network inventory for the collection of
so-called ego-networks (JANSEN 2003, 79-85). In addition, the interviewee was
asked to fill in a standardized questionnaire on input and output data of his
group.

Bibliometric data on publications, citations and co-publications at the level
of the members of each research group were collected and aggregated to a
bibliometric profile of each group.⁴ The SCI database was used for
astrophysics and nanotechnology; for microeconomics, SCOPUS was used because
of its better journal coverage.


3.2 Research input and output and disciplinary differences 


The most important resource for research is researchers. Table 1 shows some
striking differences in the size, composition and funding of research groups 
between the fields. Both fields from the natural sciences by far exceed the size 
of the typical research group in microeconomics. Dispersion of size is high in 
all fields and particularly in the natural sciences. Groups from universities 
tend to be smaller than those from non-university research institutes. In 
microeconomics, professors account for about one third of the manpower of a 
group, while in nanotechnology they represent just 7%. 


Table 1 Structure of staff

                                 Astrophysics    Nanotechnology  Microeconomics
                                 Mean    Std     Mean    Std     Mean    Std
# researchers                    13.4    11.5    13.8    13.5    4.1     2.4
% internally funded              48.1%   5.8     31.5%   4.2     85.1%   1.9
% funded by third stream         51.5%   9.1     63.6%   8.9     14.4%   0.8
% of postdocs                    67.9%   7.5     48.8%   8.9     43.4%   1.7
% of C3/C4                       9.4%    0.8     6.9%    0.6     30.8%   1.0
% of doctoral students           51.7%   6.6     50.6%   6.8     67.9%   2.0
% of research students funded    9.3%    2.7     4.3%    1.0     13.9%   0.9
# technicians                    2.2     2.8     2.9     2.4     0.3     0.5
# disciplines in group           1.4     0.6     2.1     1.1     1.6     0.7
Valid cases listwise             22              22              22


In line with the idea of a mode-2 field, nanotechnology groups fund 64% of
the group’s manpower by external funds. The difference from astrophysics, a
natural science mode-1 field, is not that large (52%) and only on the edge of
being significant. The funding structure of the microeconomics groups is
completely different. With almost one third of manpower at the professorial
level, the degree of internal funding is much higher (p<0.01). Disciplinary
heterogeneity in the groups is highest in nanotechnology.

4 The data on the population institutions and the bibliometric data on the research
groups studied in depth were provided by the central data project, ISI Karlsruhe. Thanks
to Ulrich Schmoch, Sybille Hinze and Torben Schubert.

Time for research is lowest in microeconomics, with a large share of
professorial group members who devote a lot of time to teaching duties.
The percentage of time for research is much higher in both natural science
fields, with a higher share of externally funded personnel explicitly hired
for research projects. The percentage of time for project acquisition is
largest in nanotechnology (13.4%), the field with the largest share of
externally funded researchers. Time for research is getting scarcer, and time
for teaching and administrative duties has increased in all fields.


Table 2 Allocation of work time of the group

                                        Astrophysics    Nanotechnology  Microeconomics
                                        Mean    Std     Mean    Std     Mean    Std
% time for research                     60.4%   17.6    55.9%   17.6    40.5%   17.0
% time for teaching                     19.8%   11.9    18.6%   10.6    34.1%   15.0
% time for project acquisition          10.2%   5.7     13.4%   8.4     7.9%    5.6
% time for other work                   9.5%    9.8     12.0%   7.8     17.0%   10.7
Change in time for research             2.7     0.8     2.7     0.8     2.6     0.8
Change in time for teaching             3.3     0.8     3.7     0.6     3.6     0.7
Change in time for project acquisition  3.3     0.8     3.5     0.8     3.0     0.8
Change in time for other work           3.6     0.8     4.1     0.8     3.8     1.0
Valid cases listwise                    23              20              21

Change: 1 = much reduced, 5 = much increased


In all fields, groups invest by far most of their research time in basic
research projects. Astrophysics qualifies as a typical mode-1 field with very
little applied research; nanotechnology groups devote a fifth of their capacity
to applied research. The high share of applied research in microeconomics
(25%) reflects the work of groups applying microeconomic analysis to
environmental, innovation and agricultural problems. It comes as some surprise
that there is hardly a difference between nanotechnology and astrophysics
concerning the amount of development work (13-15%). Interview data show
that this work in both fields deals mostly with the building of new research
equipment.

Output indicators show a large amount of dispersion. This is partly due to
the large differences in group size. The large standard deviations reflect the
typical evidence of highly skewed distributions of research output (Lotka
1926, Price 1976). While publications in national journals still have some
relevance for microeconomics, astrophysics and nanotechnology groups
publish almost exclusively in international journals. For all fields,
publication in international journals is the most important output. Even in
nanotechnology, patents are of negligible relevance.


Table 3 Research orientation and type of funding

                                        Astrophysics    Nanotechnology  Microeconomics
                                        Mean    Std     Mean    Std     Mean    Std
% basic research                        84.4%   19.1    63.9%   26.3    74.3%   31.0
% applied research                      2.5%    5.3     21.1%   21.8    25.7%   31.0
% development                           13.1%   16.9    15.0%   12.4
% of third stream funded research       59.6%   23.5    70.7%   30.6    20.2%   27.0
out of this
  % funded by science foundations       57.2%   33.2    58.2%   31.3    76.8%   36.5
  % funded by public/state institutions 37.0%   33.6    26.6%   23.6    18.2%   33.6
  % funded by industry                  0.6%    2.2     8.6%    11.4    5.0%    9.2
  % other funding                       5.2%    14.7    7.1%    18.0    0.0%    0.0
Valid cases listwise                    23              21              23


Table 4 Output indicators - self report

Time period 2002-2003                       Astrophysics    Nanotechnology  Microeconomics
                                            Mean    Std     Mean    Std     Mean    Std
Papers in international refereed journals   44.4    43.0    39.0    64.6    7.0     8.5
Papers in national refereed journals        0.0     0.0     0.2     0.7     1.4     1.7
Conference papers                           48.0    62.3    34.8    48.4    9.3     14.7
Papers international/researcher             3.8     2.3     2.8     1.6     2.0     3.0
Papers national/researcher                  0.0     0.0     0.0     0.1     0.4     0.5
Conference papers/researcher                4.0     3.1     2.1     1.3     2.3     2.9
Valid cases listwise                        23              20              22

The differences between the fields shrink a lot when we control for number 
of researchers. The number of international journal papers per researcher and 
conference papers per researcher is about twice as large in astrophysics 
compared to microeconomics. Nanotechnology is in between with regard to 
international papers and at the same level as microeconomics in conference 
papers. The lower productivity of microeconomics is probably due to the fact 
that these groups devote less of their time to (externally funded) research than 
both natural science fields. 

In table 5, self-reported data from the standardized questionnaire and the
self-assessment data from the interviews are compared to bibliometric data.
Nanotechnology groups are slightly more productive according to
bibliometric data, while the profile of astrophysics is slightly lower.
Microeconomics ranks third in bibliometric indicators, self reports and
self assessments alike.
Table 5 Comparison of questionnaire and bibliometric output data

Time period 2001-2002               Astrophysics            Nanotechnology          Microeconomics
                                    Mean    Std     n       Mean    Std     n       Mean    Std     n
Papers in international refereed
journals, self report               44.4    43.0    23      39.0    64.6    20      7.0     8.5     22
Papers in international refereed
journals, bibliometric data         37.2    46.1    25      45.5    65.8    27      0.7     0.9     25
Citations                           84      72      25      49      45      26      02      71      25
Self assessment                     1.67    0.92    24      1.95    0.86    21      2.95    1.23    20

Self assessment: 1 = international top group, 5 = not so strong


3.3 Network structures and network strategies


Data on the ego networks, i.e. focused networks around the research group, 
were collected with the help of a standardized inventory. Interviewees were 
asked to name all those actors (so called “alteri”) with whom they collaborate 
in joint projects. For up to 20 mentioned ties structural data (i.e., whether the 
alteri also know each other) and attributes of the alteri were collected. 

Corresponding to the differences in group size, large networks are much
more common in the natural sciences than in microeconomics. The difference
is even larger when we look at the size of gross networks instead of the
smaller networks described structurally in table 6. Nanotechnology commands
the largest gross networks (28.6), followed by astrophysics (24.8) and
microeconomics (13.4). The structurally described networks in the three fields
are of similar density (0.41-0.43), i.e., somewhat less than half of the
possible ties between alteri exist. As a young field, nanotechnology is
characterized by rather young ties compared to astrophysics. Mean tie strength
hardly differs between the fields; it is slightly lower in nanotechnology.
Astrophysics is the most international field. It comes as no surprise that
nanotechnology is the only field with a relevant percentage of ties to
industry, although almost ninety percent of its ties are to other academic
groups. Thus academia still seems to have a large say in research questions.
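
To make the density figure concrete, the following minimal sketch computes the density of an ego-network as the share of possible ties between alteri that actually exist. It assumes that the ego is excluded and that ties are undirected; the function name and the example data are hypothetical and are not taken from the study.

```python
from itertools import combinations


def ego_network_density(alteri, ties):
    """Share of possible alter-alter ties that exist (ego excluded, undirected)."""
    n = len(alteri)
    if n < 2:
        return 0.0  # density is not meaningful with fewer than two alteri
    possible = n * (n - 1) // 2
    known = {frozenset(pair) for pair in ties}
    present = sum(
        1 for a, b in combinations(alteri, 2) if frozenset((a, b)) in known
    )
    return present / possible


# Hypothetical inventory: five collaboration partners, four known ties among them.
alteri = ["A", "B", "C", "D", "E"]
ties = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "E")]
density = ego_network_density(alteri, ties)  # 4 of 10 possible ties
```

A value of 0.4, as in this made-up example, is close to the field means of 0.41-0.43 reported for the structurally described networks.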

Concerning the hypotheses on network strategies, we need more qualitative
information on how groups perceive their networks, how they use them and
what the driving forces of network formation and change are. Data on the
change in networks will be collected in follow-up interviews scheduled for
2006 and 2008. In the first interviews, qualitative information on the origins
of research projects and research networks and on the strategies connected to
them was collected. A qualitative content analysis yielded several
non-exclusive factors, which were coded as multiresponse variables for
statistical analysis.
Table 6 Structure of research networks

                            Astrophysics            Nanotechnology          Microeconomics
                            Mean    Std     n       Mean    Std     n       Mean    Std     n
Size*                       10.080  3.174   25      11.300  4.027   27      7.240   5.797   25
Network density*            0.427   0.235   25      0.412   0.236   27      0.430   0.226   23
Mean duration of ties*      9.674   4.076   21      6.811   3.487   20      7.742   4.379   22
Mean tie strength* **       1.533   0.226   25      1.466   0.231   27      1.533   0.284   23
% of international ties*    0.614   0.233   25      0.381   0.249   27      0.398   0.302   23
% of industry ties*         0.000   0.000   25      0.107   0.154   27      0.032   0.088   23

* Detailed ego-network data
** 1 = weak, 2 = strong


The most frequent motivation given for a project is that it emerged
naturally from path dependence. More than three quarters of the respondents in
astrophysics and microeconomics and more than half of the nanotechnologists
attributed the origin of their projects to path dependency. Scientific relevance
often goes together with path dependence reasoning. It is much more
important for astrophysics and microeconomics than for nanotechnology.
On the other hand, application options are relevant only for nanotechnologists.
More than a third of them reported that projects are strategically fitted to
external funding programs. This was reported by a fifth of astrophysics groups
and only by one in eight microeconomics groups.


Table 7 Origins of research projects and of networks

                                          Astrophysics          Nanotechnology        Microeconomics
                                          Mean   Std    n       Mean   Std    n       Mean   Std    n
% Projects: path dependent                0.84   0.28   25      0.52   0.51   25      0.76   0.44   25
% Projects: scientific relevance          0.32   0.48   25      0.04   0.20   25      0.28   0.46   25
% Projects: application relevance         0.00   0.00   25      0.40   0.50   25      0.04   0.20   25
% Projects: external incentives           0.20   0.41   25      0.36   0.49   25      0.12   0.33   25
% NW: path dependent                      0.92   0.28   25      0.72   0.46   25      0.96   0.20   25
% NW: strategic open choice of partners   0.68   0.46   25      0.80   0.41   25      0.16   0.37   25
% NW: effect of external incentives       0.24   0.44   25      0.20   0.41   25      0.08   0.28   25


The differences between the fields concerning the reasoning on the
emergence of networks follow a similar pattern. Path dependence is less
important for nanotechnology groups. They are more prone than astrophysics
groups to choose new partners with a strategic and open perspective.
Microeconomics groups hardly make use of an open strategic choice of new
partners. They also very seldom refer to external incentives for the
establishment of their networks, whereas such incentives do have some effect
in both natural science fields.

There is a striking difference between the fields in the role of external
funds for research opportunities. External resources were not a subject at all
in many interviews in microeconomics. Obviously the much lower
infrastructural and personnel requirements of microeconomic research mean
that external resources and group size are not a problem for this field. On
the other hand, for more than two thirds of the natural science groups
external funding is a sine qua non for doing research.

Networks are a strategic asset for the research capacity of most of the
natural science groups. In contrast, microeconomics groups rate the importance
of networks lower, and their views are more divided. The function of networks
is to provide complementary resources, skills and knowledge. Critical mass
reasoning is less important; it is most common in nanotechnology.
Specialization is strongest in astrophysics: more than half of astrophysics
groups specialize in both subject and methods. Modular arrangements of
projects or the conduct of several unconnected projects are the most common
strategies for nanotechnologists. Microeconomics seems to be divided between
the two ends of the scale. A large group specializes in subject and methods,
and another one conducts diverse unconnected projects.


Table 8 Relevance of external resources, networks and types of specialization

                                                    Astrophysics          Nanotechnology        Microeconomics
                                                    Mean   Std    n       Mean   Std    n       Mean   Std    n
% research capacity depends on external resources   0.72   0.48   25      0.68   0.48   25      0.12   0.33   25
% external resources provide nice adds              0.28   0.58   25      0.16   0.37   25      0.16   0.37   25
% smallness of group is a problem                   0.16   0.37   25      0.08   0.28   25      0.00   0.00   25
NW importance*                                      2.72   0.46   25      2.88   0.33   25      2.24   0.83   25
% NW function: complementary resources              0.84   0.37   25      1.00   0.00   25      0.76   0.44   25
% NW function: assembling critical mass             0.12   0.33   25      0.16   0.37   25      0.08   0.28   25
% low specialization, heterogeneous projects        0.12   0.33   25      0.24   0.44   25      0.40   0.50   25
% modular arrangement of projects and capacities    0.20   0.41   25      0.36   0.49   25      0.08   0.28   25
% specialization in methods                         0.16   0.37   25      0.20   0.41   25      0.36   0.49   25
% specialization in subject and method              0.52   0.51   25      0.20   0.41   25      0.24   0.44   25

* Importance of networks: 1 = low, 3 = essential
4 Evaluation of hypotheses
4.1 Bivariate correlation analysis 


As an indicator of performance, the self reports on the number of
international papers in refereed journals are used here. Their correlation
with the bibliometric indicator is r=0.897 (p<0.001). Since a split of the
data for a discipline-specific analysis is not possible, the dependent
variable in the bivariate analysis is transformed to z-values using
discipline-specific means and standard deviations.
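
The discipline-specific standardization can be sketched as follows. This is only an illustration under stated assumptions: the paper counts are made up, and the sample standard deviation is used because the text does not say whether the sample or the population formula was applied.

```python
from collections import defaultdict
from statistics import mean, stdev


def z_by_discipline(groups):
    """Standardize each group's paper count within its own discipline."""
    by_field = defaultdict(list)
    for field, papers in groups:
        by_field[field].append(papers)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    stats = {f: (mean(v), stdev(v)) for f, v in by_field.items()}
    return [
        (field, (papers - stats[field][0]) / stats[field][1])
        for field, papers in groups
    ]


# Hypothetical data: (discipline, number of international papers).
groups = [
    ("astrophysics", 20), ("astrophysics", 40), ("astrophysics", 60),
    ("microeconomics", 2), ("microeconomics", 6), ("microeconomics", 10),
]
z_values = z_by_discipline(groups)
```

After the transformation, a group with 10 papers in microeconomics and a group with 60 papers in astrophysics receive the same score in this example, because each is one standard deviation above its own field mean. This is what allows groups from fields with very different publication levels to be pooled in one correlation.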

Despite the small sample size, some hypotheses from section 2.5 can be
corroborated by the data (see table 9). The size of the network, both the
overall gross size (r=0.40, p<0.01) and the size of the structurally described
network (r=0.34, p<0.01), correlates significantly with discipline-specific
performance (H1).
Some of the central variables of the mode-2 thesis, the percentage of industry 


Table 9 Correlation matrix: Bivariate analysis and regression variables 


Bivariate pearson correlations 


1 2 3 4 5 6 7 
1 Ln (paper intern. journals) 
2 Z-transformation per discipline er 
3 # researchers DORs ‚Er rr 
4 % externally funded research 31** —.20 
5 % applied research — .29** — 08 — .08 
6 # of disciplines 24** PR E vi 22* 
7 Size of gross network DO's .40*** 69%" wit — 13 K Viii 
8 Size of described network A8** 34 Ee .26** — .03 ar .13 
9 % industry ties oe .26** 38t 12 .30* 46** 30% 
10 (% industry ties)? 22% 22% B2r* .09 .27* 43%* .13 
11 % international ties .14 0.08 — 16 = 02 — 44** — 09 22 
12 Density of network .03 .05 .24* — .04 —.25* —.04 .41** 
13 Strength of ties .06 .14 10 — .06 .02 — 21 —.10 .00 
14 Duration of ties .04 — .06 14 .06 — .25 — .04 93% 16 
15 Importance of networks ie ia .18 .20 49%**  — 01 218 DOr .34** 
16 NW origin: path dependent — .24**  — .18* — .21* —.17 —.21* —.07 — .32*** .20* 
17 NW origin: open choice of partners 38t** .08 aie Be ii .06 ll 22" .30** 
18 % low specialization — .07 .09 AL .10 .10 14 — .08 22 
19 % specialization subject & methods 11 .09 — .04 — 16 — 16 .05 .14 .16 
20 field 1 astrophysics Otte .00 .08 .19 —.42** — 30** .08 .07 
21 field 2 nanotechnology 308 .00 22 Agr .14 42** .24** MA aa 
22 Non-university group Agere BER 31*** —0.18 — 0.12 0.17 ete 24** 


* p = 0.10 2-sided 


** p= 0.05 2-sided 
*** p = 0.01 2-sided 


n pairwise between 72 and 76, except duration of ties: n = 62 
ties (H4: r=0.26, p<0.05) and the number of disciplines in the group
(r=0.23, p<0.05) are positively correlated to performance. However, a 
positive effect of the amount of applied research cannot be corroborated (H3: 
r=0.08). 

Concerning indicators of network structure, the only significant effect is the
percentage of industry ties (see above, H4). Neither density of networks nor
strength or duration of ties displays a significant relation to performance. This
casts some doubt on a purely structural approach to networks (section 2.2)
and underlines the necessity to look at the learning and network strategies of
the actors (section 2.3).

Concerning the qualitative aspect of network strategies (H7), we find a
relevant negative correlation of a path dependent “non-strategy” in network
formation with performance (H7: r=-0.18, p<0.10), but no effect of a
strategic open choice of network partners or of young ties. Open partner
choice does, however, correlate with the size of collaboration networks (gross

Table 9 (cont.) 


Bivariate pearson correlations 


10 11 12 3 14 15 16 17 18 19 20 21 
.97** 
.28* — .25* 
02 02 — .05 
AT —.17 AT — 19 
.18 — 14 18 30* —.02 
41 A1 12 .06 —.03 —.02 
13 29%* = — 13 — .05 04 —.01 — 18 
.06 — 14 — 04 —.01 04 02 38**  — .20* 
.26* 22 — .25* 19 —.02 —.23 — .23 — 13 — .09 
.19 —.11 .08 — .08 —.16 .29* .06 27er 18 — .33** 
30** — .25* .38** .01 .07 27* 12 Bi 19 — 22 .30** 
39%** 35** —.22 — .03 —.13 —-.21 30** — .31*** 36** — 02 — 18 — .51** 
09 12 .18 — 02 .03 .06 AS — 15 .26** —0.11 .28** .25** 0.08 
networks: r=0.22, p<0.10; detailed networks: r=0.30, p<0.05). On the other
hand, a path dependent choice is correlated negatively with both measures of
network size. Strategic network behavior in turn is correlated positively with
the size of a group (r=0.31, p<0.05) and the amount of externally funded
research (r=0.51, p<0.05).

Hypotheses 8 to 10 are only partly corroborated by bivariate analysis.
Neither specialization nor heterogeneity of project strategy has a significant
effect on performance (H8). Specialization is associated with larger
networks, but the effect is not significant. We observe a negative
correlation between the percentage of applied projects and specialization
(r=-0.16), albeit again not significant. A heterogeneous project strategy
is associated with a larger share of industry ties (r=0.26, p<0.10) and with
smaller networks (H9: r=-0.22, p<0.10).

Of greater statistical relevance are the hypotheses on potential advantages
that groups derive from their embeddedness in larger established
non-university research organizations. Non-university groups tend to be
significantly larger (23.78 vs. 9.41). Most of them come from astrophysics,
only a few from microeconomics. Non-university groups perform significantly
better (H13: r=0.37, p=0.001). They command larger networks (H11: r=0.32,
p=0.005) and they are more often specialized than university groups (H12:
r=0.28, p=0.014). At the same time they are less dependent on external funding
(r=-0.18, p=0.124).
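
As a rough plausibility check on such significance levels, a Pearson correlation can be converted into a t statistic with n - 2 degrees of freedom. The sketch below assumes n = 74, a value inside the pairwise range of 72 to 76 cases given with table 9; the exact n for each coefficient is not reported, and the function name is illustrative.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson correlation r against zero (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


N_ASSUMED = 74  # assumption: the per-coefficient n is not given in the text

t_performance = t_from_r(0.37, N_ASSUMED)   # non-university groups and performance
t_funding = t_from_r(-0.18, N_ASSUMED)      # non-university groups and external funding
```

With about 72 degrees of freedom, the two-sided 1% critical value of t is roughly 2.65. The first statistic (about 3.4) clearly exceeds it, while the second (about -1.6) does not even reach the 10% threshold, which matches the pattern of the p-values reported above.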

As a preliminary conclusion, I hold that networks and particularly network
size have a strong effect on scientific performance. Heterogeneity of networks,
particularly industry ties, can have a positive effect on performance. However,
applied research and a heterogeneous research strategy do not have positive
effects on performance. A strategic attitude to networks, in contrast to a path
dependent attitude, has a positive effect on the size of networks. Strategic
network behavior leads to and/or tends to be supported by group size and
larger amounts of external research money. It comes as some surprise that
specialization is not correlated with performance. The options for
specialization seem to be better in the context of non-university research
organizations. Groups in these contexts perform significantly better. They are
larger and can attract larger networks. At the same time, they are less
dependent on external funding. Focused funding is acknowledged as relevant for
project choice in the two natural science fields, mostly by university groups.
Whether such resource constraints indeed lead to less specialization of
university groups is a research question that needs further attention.
4.2 Preliminary regression models


Small numbers and skewed distributions such as those of performance data pose
severe problems for a thorough regression analysis. The current data file
therefore will be enlarged soon in order to reach better founded
conclusions. The large differences between the disciplines and their production
logics call for a discipline-specific analysis, which unfortunately is
impossible with the small data set available now.

What is presented here is a linear regression analysis with the self reported
number of papers in international refereed journals as the dependent variable.
To improve the fit to assumptions on the distribution of residuals, the
numbers have been transformed to their natural logarithm. Two dummy
variables are introduced to represent the field differences. I start from a
baseline model with the two field variables and the number of researchers.
Packages of variables concerning the hypotheses from the four theory strands
(see section 2) are then introduced.
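
The setup of the baseline model can be illustrated with a small sketch of how one group's data might be turned into a regression row: the log-transformed paper count as the dependent variable, and the number of researchers plus two field dummies as predictors, with microeconomics as the baseline field. The function name and the example values are hypothetical, and the text does not state how groups with zero papers were handled, so this sketch simply rejects them.

```python
import math

# Dummy-coded fields; microeconomics is the baseline and gets (0, 0).
DUMMY_FIELDS = ("astrophysics", "nanotechnology")


def design_row(field, n_researchers, n_papers):
    """Return (ln(papers), [researchers, astro dummy, nano dummy]) for one group."""
    if n_papers <= 0:
        # Assumption: the logarithm is undefined here; the source is silent on this case.
        raise ValueError("natural logarithm requires a positive paper count")
    y = math.log(n_papers)
    dummies = [1 if field == f else 0 for f in DUMMY_FIELDS]
    return y, [n_researchers] + dummies


# Hypothetical groups, loosely oriented at the field means reported earlier.
y_nano, x_nano = design_row("nanotechnology", 14, 39)
y_micro, x_micro = design_row("microeconomics", 4, 7)
```

With this coding, the coefficients of the two dummies express how far astrophysics and nanotechnology lie above or below microeconomics once group size is held constant.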

In table 10, beta coefficients (left column) and the respective p-values
(right column) are reported (see table 9 for a correlation matrix).
Because of the small case numbers, strong significance cannot be expected
even for sizable beta coefficients.

Model 1 is the baseline model. The differences between the fields are highly 
significant. Performance compared to the baseline field (microeconomics) is 
higher for astrophysics (beta = 0.613) and nanotechnology (0.529). Difference 
in group size (# of researchers, beta = 0.430) also accounts for a sizable part of 
the adjusted explained variance of 0.627. 

Some of the strong effects of size and fields can be captured by other
variables. However, given the high percentage of explained variance reached
by the basic model, R² increases only very modestly. Model 2 tests for the
effects of the mode-2 variables. The number of disciplines is not a relevant
factor, but the percentage of industry ties and the percentage of applied
research seem to have some effect, despite a lack of significance. As in the
bivariate analysis, the percentage of industry ties increases the productivity
of the group (beta = 0.125) as long as the group devotes only a small share of
its work time to applied research (beta = -0.135). This could imply that
industry ties are only scientifically productive if the group keeps a strong
footing in basic research.

Model 3 introduces the structural network variables (H4 to H6). This is the
model with the highest percentage of explained variance (R² adjusted = 0.672,
p<0.001). Field differences seem partly to be captured by network variables
such as size of network (beta = 0.17, p = 0.102), the percentage of industry
ties (beta = 0.201, p = 0.552) and of international ties (beta = 0.128,
p = 0.161). Since the quadratic term of industry ties is (not significantly)
negative, the question whether there is a curvilinear relation to performance
needs further observation. Structural variables (density, tie strength, and
duration) are again irrelevant.


Table 10 Regression analysis: Dependent variable = natural logarithm of self
reported number of publications in international journals (OLS regression)

                                     Model 1         Model 2         Model 3         Model 4
                                     n = 72          n = 68          n = 59          n = 68
                                     beta    p       beta    p       beta    p       beta    p
constant                                     0.000           0.000           0.841           0.393
# researchers                        0.430   0.000   0.393   0.000   0.408   0.000   0.370   0.000
% applied research                                   -0.135  0.130
# of disciplines                                     0.046   0.619
Size of network                                                      0.170   0.102   0.246   0.006
% industry ties                                      0.125   0.194   0.201   0.556   0.080   0.411
(% industry ties)**2                                                 -0.100  0.759
% international ties                                                 0.128   0.161
Density of network                                                   0.046   0.643
Strength of ties                                                     0.057   0.499
Duration of ties                                                     -0.069  0.434
Importance of networks                                                               0.056   0.510
NW strategic open choice of partners                                                 -0.090  0.353
% low specialization                                                                 0.047   0.588
% specialization subject & methods                                                   0.046   0.576
field 1 astrophysics                 0.613   0.000   0.590   0.000   0.498   0.000   0.597   0.000
field 2 nanotechnology               0.529   0.000   0.439   0.000   0.394   0.000   0.405   0.001
R²                                   0.643           0.655           0.729           0.696
R² adjusted                          0.627           0.621           0.672           0.648
p                                    0.000           0.000           0.000           0.000

Model 4 retains the variables size of network and industry ties from model 3
and adds some variables on network and research strategies of the groups. The
model explains 69.6% of the variance; R² adjusted is 0.648. Apart from group
size and the field variables, size of network is the only variable with a
significant effect. It obviously takes up some of the effect of group size.
Thus, while small groups might need networks the most, it is the larger groups
that command large networks. Strategic network orientation and importance of
networks, which are strongly correlated with performance as well, display no
effect when introduced together with network size and size of research group.
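
The relation between R² and adjusted R² in table 10 can be checked with the standard adjustment formula. In the sketch below, the number of predictors per model is an assumption read off the rows of the table (field dummies included, constant excluded); it is not stated explicitly in the text.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R² for a model with k predictors (constant excluded) and n cases."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# (R², n, assumed number of predictors, adjusted R² as reported in table 10)
models = {
    "Model 1": (0.643, 72, 3, 0.627),
    "Model 2": (0.655, 68, 6, 0.621),
    "Model 3": (0.729, 59, 10, 0.672),
    "Model 4": (0.696, 68, 9, 0.648),
}

recomputed = {
    name: adjusted_r2(r2, n, k) for name, (r2, n, k, _reported) in models.items()
}
```

Under these assumptions the recomputed values agree with the reported ones to within rounding. The calculation also shows why model 3 gains little in adjusted terms despite the highest raw R²: it uses the most predictors on the smallest number of cases.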

In order to be aware of the drawbacks of a joint analysis of the fields, a
comparison of the correlations of the two types of performance variable, the
discipline-specific transformation and the logarithmic transformation, is of
some help (see table 9). There are some striking differences: the
amount of external funding changes sign when we regard the
discipline-specific standardization. The effect of network size stays positive
and significant, but is reduced. The negative effect of a path dependent
network strategy holds for both indicators. However, indicators of networks’
strategic importance and strategic network building lose much of their
positive effect. This could mean that research resources such as external
research money and large networks take on a very different role in the three
fields studied here. Further analysis on the basis of a larger data set and
longitudinal data will have to disentangle these effects in detail.


5 Conclusion 


The preliminary analysis presented here makes clear that networks are an
important factor in scientific performance. In particular, the size of networks
has a relevant and significant effect on performance. On the basis of this small
and heterogeneous sample, a relevant effect of structural network attributes,
on which the different strands of social capital theory focus, could not be
found. Instead, an actor-oriented approach focusing on the network strategies
of a research group seems to be promising. A path-dependent “non-strategy”
in network formation not only prevents groups from attracting sizable
networks but also has a negative effect on scientific performance. Small groups
have a tendency to build their networks in a path dependent way, while larger
groups tend to take a more strategic view of networks. What is cause and
effect here is hard to say from cross-sectional data. Follow-up interviews are
planned to disentangle these relations.

While networks are important, they seem to have quite a different role in
the subfields studied. They may have a strong role in attracting external
research money, which can be used to enlarge one’s group. But there are
disciplines such as microeconomics which are not really dependent on
external money and large group sizes. A disciplinary split of model 1, despite
the very small numbers, yielded no effect of group size on performance in
microeconomics, but strong significant effects in the natural science fields. A
thorough analysis of the economies of group size, network size and amount of
external funding is scheduled for the next year, when the larger data sets for
the fields are available.

As for the mode-2 thesis, it can be stated that nanotechnology indeed dis- 
plays some of the mode-2 attributes. But except for the stronger orientation 
towards applied projects, the differences from the other fields are not really 
striking. Nanotechnology groups have on average 10% industry ties, but still 
90% academic ties in their networks. Concerning the relation to performance, 
the picture is still ambivalent. A large amount of applied research and het- 
erogeneous projects seem to hamper research performance, while industry ties 
as such have a positive effect. Here we need to follow up the question of 
whether there is an upper threshold for the amount of industry ties, which 
might lead to an inverted U-shape of its effect on performance. 
Networking in Science and Policy Interventions 


Comment by 


HENRIK EGBERT 


Dorothea Jansen’s paper deals with the influence of networking and social 
capital on the production of knowledge. Jansen investigates this topic for 
Germany, where reforms of the mainly publicly financed science sector promote 
network building among scientists. The major questions concern the effects of 
scientists’ ego-centred networks on the output they 
produce (chiefly publications). The study is part of a larger research program 
on scientific networks in Germany. The focus is on three disciplines, i.e., 
astrophysics, nanotechnology and microeconomics. The paper combines 
sociological theory and empirical data, addressing topics that are of relevance 
because the results may allow an evaluation of some of the reforms in the 
German scientific system. 

The paper consists of two parts. In the first part, theories about the role of 
networks in science together with the research hypotheses are outlined. In the 
second part, Jansen presents preliminary results from an investigation of ego- 
centred networks. Data include 25 research groups from each of the three 
fields. Jansen tries to answer two questions. The first focuses on the effects of 
heterogeneous networks on output and the second addresses the effects of an 
applied research strategy on output. She analyzes existing networks with the 
standard, well-established sociological toolkit and tests several hypotheses 
that may explain positive network effects. 

The prevalent assumptions throughout the paper suggest that networking 
(and social capital) generally yields positive effects. The idea is that network 
structures have inherent resources; if individuals collaborate, positive network 
effects come into existence. These may then foster the production and the 
spread of knowledge. The idea of positive network effects has been taken 
up in politics. Politicians and research funding institutions nowadays implement 
reforms which favor a high degree of networking (e.g., the Sixth 
Framework Programme of the EU). One consequence is that resources are 
redistributed in favor of scientists who are active in networking. It seems that 
politicians tend to think that networks can cure the shortcomings in the 
German science system. Thus, the well-known statement of Portes (1998, 2) 
that “social capital has evolved into something of a cure-all for the maladies 
affecting societies” also seems to apply to network approaches. 

For this reason, I deal with an aspect left out in the paper, the influence of 
policy reforms on scientists’ decisions and science output. I assume that 
individuals are ‘embedded’, i.e., institutions matter, and that they behave 
rationally, i.e., they adjust their behavior to changing incentives. I start with an 
environment for scientists where no externally-set incentives for networking 
exist. After that I consider the political interventions in this environment. 
Within these scenarios, I point out some aspects that could be of interest in the 
context of Jansen’s research.¹ 

Let us imagine an environment for scientists in which there is no external 
intervention that promotes or punishes collaboration in research. Let us fur- 
ther assume that scientists behave rationally by publishing in recognized 
journals. The more they publish, the better. For the production of publications, 
they rely on inputs from other scientists. For instance, scientists use published 
articles from other scientists to develop their own research ideas. However, 
using published articles or patented ideas is not cost-free. Scientists can reduce 
transaction costs if they decide to collaborate with each other in the exchange 
of knowledge so that information can flow faster. Therefore, collaborating 
units — we may also call them networks — come into existence. Moreover, each 
scientist within a network can specialize.² The gains from such a division of 
labor are shared among network members (see, e.g., Beaver 2001) and a 
scientist participates in the collaboration as long as individual gains are positive. 
In such “self-organising networks” (Wagner and Leydesdorff 
2005, 1608), the exchange of knowledge is faster and cheaper than between 
networks. Transaction costs are low because entry and exit barriers are low. 
Presumably such networks have a rather informal character. Assuming that 
scientists are free to choose with whom and with what intensity to collaborate, 
it can be argued that for scientists networks are a means of achieving 
individual objectives.³ Therefore, networks will have the structure, the density 
of ties and the size that suit their members best. Obviously, networks prove to 
be rather heterogeneous with respect to these variables, as the data from 
different networks and research fields show. 

Now let us assume that the environment changes through a policy inter- 
vention which rewards scientists who work in networks. I assume further that 
the total budget for science remains constant. Thus, the policy intervention 
leads to a redistribution of public resources in favor of those scientists who 
work in networks (or claim to do so). This change in the institutional setting 
provides incentives to which scientists react. In order to look at the effects, I distinguish two 
types of scientists. 

The first type does not engage in networking before the policy changes. 
These scientists may now need to become members of networks; otherwise 


1 This comment can be understood as a contribution to the ongoing discussion among 
sociologists and economists on networks. For a recent debate, see Rauch and Casella 
(2001) and Zuckerman (2003); for a criticism of social capital and network concepts in a 
specific context see Egbert (2006). 

2 Cf. Walstad (2002, 14-15) for specialization and exchange in science. 

3 Cf. Melin (2000, 34) for reasons to collaborate. 
they lose resources. In extreme cases, collaboration is no longer a question 
of whether it is useful for the production of output; it is a precondition for 
receiving funds. Scientists of this type have to devote more time to networking 
activities. They may either try to enter existing networks or decide to set up 
new ones. In both cases, individual efforts are spent on network building 
instead of the production of output. Consequently, the policy intervention is 
most likely to have negative effects on output. 

The second type is actively engaged in collaborations before the policy 
intervention. One effect of the new incentive could be that collaborations 
become more formalized for two reasons. First, other scientists wish to 
enter the networks in order to get part of the available budget. For those 
scientists who are already in the network, it could be reasonable to set up 
entry barriers for newcomers. Second, funding is related to formality; it is 
difficult for informal networks to receive funds. With formalization, transaction 
costs rise.⁴ This may cause efficiency losses as compared to a situation 
of informal collaboration. Another effect is that an increase in the network 
size also increases the probability of conflicts among members, thus leading to 
efficiency losses. It is not at all clear whether a policy intervention which 
promotes networking increases the production of knowledge. There are 
plausible reasons to believe that the opposite effect is possible.⁵ 

To sum up, collaboration among scientists evolves naturally. Networks show 
different structures reflecting the aims of the participants. As long as these 
collaborations are voluntary, they can be considered as efficient arrangements 
for the production of scientific output. If the incentives for collaboration 
change, then the structures of networks (size, density, structure, etc.) change as 
well. Networks that were efficient before the policy intervention are unlikely 
to be efficient after the policy intervention. For this reason, it cannot be 
argued that the observation of a correlation between network variables and 
output measures provides a justification for a policy that influences these 
variables. Such a policy can only be justified if it can be shown that, without it, 
network formation is inefficient. 

Jansen certainly contributes to the knowledge on networks in the German 
science sector by describing particular networks. Her case study provides 
detailed information on ego-centred network structures. However, it is diffi- 
cult to understand how descriptions of existing networks will help to evaluate 
policy interventions in science. Networks are constantly formed and reshaped 
by individual decisions. If one aims at an evaluation of policy reforms that 
favor networking in science, one needs a theory on network formation. Since 
Jansen does not refer to such a theory, important questions remain unanswered. 
To mention but some of the prominent ones: What are the general 
patterns (in terms of optimal size and optimal density) of efficient networks? 
How do size, density, and structure of networks fluctuate when the 
resources available for the network increase or decrease?⁶ What is the optimal 
distribution of resources within a network? A more thorough insight into the 
matter would make it necessary to combine the descriptive material presented 
by Jansen with a theory that allows predictions about individual behavior 
upon changes in the environment. 


4 The transaction costs of formal networks may even be so high that scientists decide 
not to collaborate. 

5 See also Cowan and Jonard (2003) for negative effects of growing networks. 

A Beauty Contest of Referee Processes of Economics 
Journals 


by 
CHRISTIAN SEIDL, ULRICH SCHMIDT and PETER GROSCHE* 


This paper is a concise report on an internet survey of economists to ascertain 
their satisfaction with peer review processes of publishing in economics 
journals. Other problems of peer review processes are not addressed in this 
paper. Instead we refer to the starred footnote. 


1 The Survey 


At the end of 2001 and at the beginning of 2002 we addressed some twenty 
thousand persons twice by e-mail, asking them for their online responses to 
seven questions concerning their experience with referee processes of eco- 
nomics journals. Some 6,000 persons in total had a look at our questionnaire, 
but only a bit more than 10% of these individuals started to respond to it. 

As the professional institutions in the Anglo-Saxon world did not even 
respond to our inquiry, let alone offer their cooperation, we used as many 
sources of mail addresses as possible in the hope of capturing many econo- 
mists. We had the mail addresses of the members of the European Economic 
Association, Verein für Socialpolitik, Economics Bulletin, IZA (For- 
schungsinstitut zur Zukunft der Arbeit — Institute for the Study of Labor), and 
Inomics. Hence, we could rely on some 4,500 academic economists. Many 
other addresses were those of professional people who had never published. 
Thus, an overall response rate of 3% may seem to be small, but it increases to 
some 13% if we count the “certain” academic economists only. 
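
The two response rates quoted above can be checked with a rough calculation. The following Python sketch uses only the round figures given in the text; the number of responders (600) is an assumption derived from "a bit more than 10%" of some 6,000 visitors, not a figure reported by the survey.

```python
# Rough check of the response rates quoted in the text.
# All inputs are round figures; `responders` is an assumed lower bound.
addressed = 20_000            # persons contacted by e-mail
visitors = 6_000              # persons who looked at the questionnaire
responders = round(0.10 * visitors)   # assumption: about 10% of visitors
academic_economists = 4_500   # addresses from the five named sources

overall_rate = responders / addressed
academic_rate = responders / academic_economists

print(f"overall response rate:  {overall_rate:.1%}")
print(f"academic response rate: {academic_rate:.1%}")
```

With these round inputs the two rates come out at about 3% and 13%, in line with the figures stated above.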


* This paper was presented at the ESA/Public Choice Conference 2003 in Nashville 
and at the PET 04 Conference 2004 in Beijing. Helpful comments were received from 
Sönke Albers, Ted Bergstrom, John Conley, Leigh Hobson, Alan Kirman, Stefan Traub, 
and Joachim Wolf. The usual disclaimer applies. A more comprehensive version of this 
paper was published in Estudios de Economia Aplicada, 23/3 (2005), 505-551. We refer 
to this source as EEA. We are indebted to the editors of Estudios de Economia Aplicada 
for their permission to reprint part of the earlier article in this volume. An even more 
comprehensive study, The Performance of Peer Review: An Interdisciplinary Report, is 
currently in preparation. 
2 The journals 


In our survey we used two groups of journals. We call them invited and 
contributed journals, respectively. The invited journals centred on the famous 
Diamond (1989) list (see Table 1 in EEA), which comprises 27 economics 
journals. We enlarged this list by 49 journals taken from the A and B categ- 
ories of the VSNU (Vereniging van Samenwerkende Nederlandse Uni- 
versiteiten — Association of Universities in the Netherlands) economics 
journals ranking list, which we considered important enough to be included. 
This makes 76 invited journals. Furthermore, we invited our respondents to 
add journals to their short list at their own discretion. 

This procedure yielded a total of 359 journals. Space does not permit us to 
include a list of all journals in this article.¹ However, the structure of responses 
shows that our choice of the 76 invited journals provided a good match with 
respondents’ experience: Among the 73 journals showing at least 10 responses 
to Question 1, only four were not listed among the invited journals. All ten 
journals which attracted one hundred or more responses are also members of 
the Diamond list. All journals of the Diamond list, with the exception of the 
Brookings Papers on Economic Activity, elicited at least 11 responses to the 
first question. Moreover, the responses show that the Diamond list ignores 
some renowned (mostly non-American) journals, which existed well before 
1989. 

For the presentation of the results of our study, the data were broken down 
as follows: For the analysis of relationships between respondents’ attitudes, we 
employed all data irrespective of how many responses per journal we had. For 
descriptive documentation with respect to particular journals, we arbitrarily 
settled on at least five valid responses for the respective journals. To report on 
journals with fewer than five valid responses would probably convey a distorted 
picture. As some respondents had chosen to drop out during the survey, the 
set of journals decreases somewhat for questions posed later. For Questions 1 
and 2, 110 journals had at least five valid responses, for Question 3 we had 107 
journals, and for Questions 4-7 we had 106 journals. 

For the purpose of this paper we prepared, moreover, a concise summary 
documentation of subjects’ responses. We narrowed down the set of journals using 
a cut-off benchmark of at least twenty valid responses to Question 7 (the last 


1 The list of all 359 journals can be downloaded from our homepage http://www.wiso. 
uni-kiel.de/vwlinstitute/ifs/chair/peerreview.php as Table 1*. It contains all journals for 
which the first question was answered (respondents could answer subsequent questions 
only by passing Question 1 first). In this table, the invited journals are marked with an 
asterisk, the journals of the Diamond list among them with a diamond, and the con- 
tributed journals are unmarked. 
question). 51 journals satisfy this condition. Table 1 (see p. 241) shows them 
ordered according to the ranking of the journals with respect to Question 7 
asking for subjects’ general satisfaction with the journals’ referee processes. 
All rank numbers are, however, taken from the more comprehensive tables. 


3 Respondents 


A survey of researchers’ experience with referee processes can follow several 
routes. One possibility is to address successful authors who managed to get 
their papers published in a journal. The other way is to address economists at 
large and thus also collect the experience of the less fortunate ones, which is 
crucial for a valid picture of the performance of referee processes. 
Although this approach might be in danger of attracting mainly frustrated 
authors who wish to take revenge on allegedly unfair refereeing, we shall 
demonstrate below that our data do not suffer from such biases. Rather they 
are biased in the opposite direction. 

Taking the second route, we asked several institutions for e-mail addresses 
of economists. We received help from the European Economic Association, 
the Verein für Socialpolitik, the editorial board of the Economics Bulletin, 
from IZA and from Inomics. Our faculty colleagues Sönke Albers and Joa- 
chim Wolf also provided good advice. Some e-mail addresses of economists 
were collected by us. Several institutions, mostly from the Anglo-Saxon world, 
did not even reply to our inquiries, let alone offer us their cooperation. These 
were, in particular, The American Economic Association, The Econometric 
Society, and The Royal Economic Society, as well as some less well-known 
Asian economic associations. This refusal of cooperation implies that Amer- 
ican, Asian and Pacific economists are unfortunately underrepresented in our 
survey (see Table 2 in EEA). Our only choice was to work with the data 
available or to abandon our endeavour altogether. We decided to continue our 
analysis. 

The data of all respondents underwent a plausibility test. This led to the 
elimination of the data of 9 respondents, representing their joint responses to 
22 journals, for various reasons.³ As these were typing errors, jokes, or 
attempts at manipulation,⁴ the data of these subjects had to be eliminated. 
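
The plausibility test can be illustrated with a short Python sketch that applies the three elimination criteria listed in footnote 3. This is only an illustration under stated assumptions: the record layout and all field and function names are hypothetical, and the sketch assumes that a respondent is removed entirely as soon as one of his or her journal entries fails a criterion.

```python
from dataclasses import dataclass

@dataclass
class JournalResponse:
    """One respondent's answers for one journal (hypothetical field names)."""
    response_weeks: float    # Question 1: weeks until the first reply
    referee_reports: float   # Question 2: average number of referee reports
    papers_submitted: int    # Question 3: number of papers submitted

MAX_WEEKS = 250    # threshold taken from footnote 3
MAX_REPORTS = 10   # threshold taken from footnote 3

def is_plausible(r: JournalResponse) -> bool:
    """Return False if the entry meets one of the three elimination criteria."""
    if r.response_weeks > MAX_WEEKS:
        return False
    if r.referee_reports > MAX_REPORTS:
        return False
    # Referee reports received although no paper was submitted.
    if r.referee_reports > 0 and r.papers_submitted == 0:
        return False
    return True

def filter_respondents(data: dict[str, list[JournalResponse]]) -> dict[str, list[JournalResponse]]:
    """Keep only respondents whose entries are all plausible (assumed rule)."""
    return {rid: rs for rid, rs in data.items() if all(is_plausible(r) for r in rs)}

# Made-up example data, not taken from the survey.
sample = {
    "r1": [JournalResponse(12, 2, 3)],
    "r2": [JournalResponse(300, 2, 1)],   # response time above 250 weeks
    "r3": [JournalResponse(10, 11, 1)],   # more than 10 referee reports
    "r4": [JournalResponse(10, 2, 0)],    # reports without a submission
}
kept = filter_respondents(sample)
```

In this made-up example only respondent `r1` remains after filtering.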


2 The documentation of the results based on a benchmark of at least five valid 
responses can be downloaded from our homepage http://www.wiso.uni-kiel.de/ 
vwlinstitute/ifs/chair/peerreview.php as Tables 3*-8*. 

3 The elimination criteria were: Response time exceeding 250 weeks (3 respondents), 
having received more than 10 referee reports (5 respondents), and having received 
referee reports without having submitted a paper (1 respondent). 

4 We checked also computer IP addresses for similar evaluations, but did not observe 
suspicious similarities of responses in the cases of multiple uses of the same computers. 




Concerning the descriptive results for the particular journals (at least five 
usable responses), we were left with 630 respondents to the first question, of 
which 551 participated in the survey through to the seventh question. In the 
aggregate⁵, we had at our disposal between 4538 (for Question 1) and 3791 data points 
per question (for the entries to Questions 2-7 cf. the number of entries in 
Table 4, see p. 248). 


4 Reactions 


In addition to responses to our questions, many subjects sent us comments and 
suggestions. The tenor of their reactions was helpful, sympathetic, or critical. 

Numerous sympathetic, some of them even enthusiastic, reactions came 
from all strata of respondents. Many commentators argued that we should 
have posed more, and more detailed, questions. In fact, we originally started 
with a far more comprehensive list of questions, but decided to confine the 
questionnaire to just seven questions for fear of too many dropouts. Our 
experience with this survey provided ample evidence that we were 
right in doing so: Only about eight per cent of all persons originally 
interested in our survey embarked on responding to all questions. Other 
sympathetic scholars took the occasion of our survey to broach their own 
uneasiness with the current referee situation. 

Critical comments were received from only a few prominent economists. 
For instance, a renowned economist urged us: “Please stop sending me 
reminders about this research. Such research projects are dangerous and 
misleading!” Comments like this suggest that research on the referee 
processes of learned journals is not favored by some of the 
profession’s most prestigious scholars. 


5 The questionnaire 


When a respondent connected to our server, (s)he was first presented with a 
general plea to participate in the survey. Then the respondent was shown a list 
of our 76 invited journals and asked to select those journals with which (s)he 
had experience as an author. Furthermore, the respondent was prompted to 
add further economics journals of his or her choice. Both sets together formed 
the particular respondent’s journal set. 

Then the respondent was presented with the first question and, subsequently, 
the remaining items for the selected journals. For Questions 1, 2, 3, and 7, 
(s)he was urged to respond to the respective question for all journals in his or 


5 Counting all journals irrespective of how many responses we had per journal. 
her set. As to the first three questions, (s)he could proceed only after the 
respective question had been answered for all journals in the set.⁶ Questions 
4-6 could only be answered if the respondent had actually received a referee 
report. As some journals reject manuscripts without having solicited referee 
reports, a respondent may not have had experience with referee reports of all 
journals to which (s)he had submitted manuscripts. Therefore, for Questions 
4-6 we allowed for passing to the next question without having responded to 
the respective question for all journals. Moreover, during the response proc- 
ess, a respondent could also opt to eliminate some journals altogether from his 
or her journal set. While the questions answered up to this point were kept in 
our data, these journals were then dropped for the subsequent questions. This 
device was intended to encourage respondents to complete the questionnaire 
even if (s)he realized that (s)he had initially proposed a larger set of journals 
than (s)he was able or willing to evaluate.⁷ 

The questionnaire consisted of the following seven questions per journal.⁸ 
Question 1: “After submission of your paper, how long did it take on average to get a 
reply other than just a confirmation that your paper had been received?”⁹ 

Question 2: “How many referee reports did you receive on average?” 

Question 3: “How many papers did you submit to this journal and how many papers 
were accepted?” 

Question 4: “Were the referee reports competent?” 

Question 5: “Did the decision of the editor match the referee report?” 

Question 6: “Were the referee reports carefully done?” 


Question 7: “How was your overall satisfaction with the procedure of paper submission 
to the respective journal?” 


Note that the responses to Questions 1-3 are numbers, namely the time taken to get 
a first reply, the number of referee reports received, and the numbers of papers 
submitted and accepted. In contrast, the responses to Questions 4-7 
result from a mouse click on one of seven fields on Likert scales. In our 
results, the worst value is coded with a 0, and the best with a 6, so that 3 is 
the mean coded value of each Likert scale if all values were clicked with equal 
frequency. Notice, of course, that data from Likert scales are necessarily 
subjective data. An economist told us that he did not participate in our survey 
because he would have had to consult all his files. However, authors of 
manuscripts usually decide to send a manuscript to a journal according to the 
perceptions in their memory without having consulted their files first. Mim- 
icking this behaviour, we were interested in the immediate opinions of our 


6 This method ensures that subjects could concentrate on meaningful comparisons 
among the journals of their set for the same aspect of evaluation. 

7 The elimination of journals was easily accomplished. The respondent had only to 
remove a check mark next to the journal. 

8 The screenshots of all seven questions can be downloaded from our homepage http:// 
www.wiso.uni-kiel.de/vwlinstitute/ifs/chair/peerreview.php. 

9 In the respective cell, subjects were asked to indicate the response time in weeks. 
respondents. Using Likert scales is the proper way to capture their 
perceptions. 
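
The coding of the Likert scales described above (seven fields, coded 0 for the worst and 6 for the best value) can be sketched in Python as follows. The function name is hypothetical; the sketch merely shows why 3 is the mean coded value under equal click frequencies and how a per-journal mean could be computed.

```python
from statistics import mean

LIKERT_CODES = range(7)   # seven fields coded 0 (worst) to 6 (best)

# If every field is clicked equally often, the mean coded value
# is the midpoint of the scale.
uniform_mean = mean(LIKERT_CODES)

def journal_mean(clicks: list[int]) -> float:
    """Mean coded value of all clicks for one journal and one question."""
    if any(c not in LIKERT_CODES for c in clicks):
        raise ValueError("codes must be integers from 0 to 6")
    return mean(clicks)

# Made-up clicks for one journal: two very satisfied, one very dissatisfied.
example_mean = journal_mean([6, 6, 0])
```

Here `uniform_mean` equals 3, and the made-up example yields a journal mean of 4.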


6 Results 


To have some kind of measuring rod for comparisons, we often calibrate our 
results against those journals which are commonly regarded as being the core 
economics journals. Although there exist several categorizations (e.g., Burton 
and Phimister 1995), we decided to stick to the well-known Diamond (1989) 
list. 


6.1 Descriptive results 


Recall that, for the particular journals, we confined our attention to those 
journals which commanded at least five valid responses. This reduced the set 
of journals considered to between 106 and 110, depending on the question. The journals 
were ranked according to the given responses, where equal responses led to 
equal ranks. We used the following ranking criteria: 


Question 1: Shorter response times. 

Question 2: Greater number of referee reports. 

Question 3: Higher individual acceptance rates (papers accepted/papers submitted). 
Question 4: Higher competence of referee reports. 

Question 5: Higher matching of editorial decision and referee reports. 

Question 6: Higher carefulness of referee reports. 

Question 7: Higher overall satisfaction with referee process. 
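
A ranking of this kind, in which equal responses receive equal ranks, can be sketched in Python as follows. The text does not say how ranks continue after a tie; the sketch assumes competition ranking (1, 1, 3, ...), and all function names and example values are hypothetical.

```python
def rank_with_ties(scores: dict[str, float], higher_is_better: bool = True) -> dict[str, int]:
    """Rank journals so that equal scores share a rank (assumed: 1, 1, 3, ...)."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=higher_is_better)
    ranks: dict[str, int] = {}
    previous_score = None
    current_rank = 0
    for position, (journal, score) in enumerate(ordered, start=1):
        if score != previous_score:
            # A new score value starts a new rank at the current position.
            current_rank = position
            previous_score = score
        ranks[journal] = current_rank
    return ranks

def acceptance_rate(accepted: int, submitted: int) -> float:
    """Individual acceptance rate as used for Question 3."""
    return accepted / submitted

# Question 1: shorter response times rank better (made-up values in weeks).
time_ranks = rank_with_ties(
    {"Journal A": 20.5, "Journal B": 12.0, "Journal C": 12.0},
    higher_is_better=False,
)

# Question 3: higher acceptance rates rank better (made-up values).
rate_ranks = rank_with_ties(
    {"Journal A": acceptance_rate(1, 4), "Journal B": acceptance_rate(2, 4)}
)
```

In the made-up example, Journals B and C share rank 1 for response time and Journal A receives rank 3.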


Complete data for the present study are presented in tables which can be 
downloaded as Tables 3*-9* from http://www.wiso.uni-kiel.de/vwlinstitute/ifs/ 
chair/peerreview.php. For the concise summary documentation of subjects’ 
responses presented in Table 1 of this paper, we employed a cut-off bench- 
mark of at least twenty valid responses to Question 7. 51 journals satisfied this 
condition. In order to provide background information, we indicated in 
Table 1 the ranks R of the more comprehensive tables. As to Table 1, we had 
to settle on one ordering criterion; we used overall satisfaction (Question 7). 
As overall satisfaction is the most important characteristic, we arranged the 
columns of Table 1 in reverse order of the presentation of questions. The 
findings of our paper rest, however, on the more comprehensive results. 
Table 2 (see p. 243) provides a concise summary of descriptive results. With 
respect to response time, the Quarterly Journal of Economics stands out as the 
speediest one with a mean turnaround time of 0.613 weeks, that is, 4.29 days. 
However, given its high subjective rejection rate of 93%, this means that the 
managing editor(s) of the Quarterly Journal of Economics reject(s) many of 


[Table 1  Journal Satisfaction Ranking (at least 20 valid responses). Columns: Journal; n; (7) Satisfaction; (6) Carefulness; (5) Match; (4) Competence; (3) Acceptance rate; (2) Number; (1) Spell. Table body not recoverable.] 



Table 2 Concise Summary of Descriptive Results


Item                Remarks


Response time       Median: 20.5 weeks; OJE: 0.613 weeks; 20 journals of the 26
                    DIAMOND journals need more than 20 weeks.

Number of referee   Median: 1.75; only 24 out of 110 journals provided at least 2 referee
reports             reports.

Acceptance rates    Median: 0.5; 23 DIAMOND journals below 0.5.

Competence          Median: 3.641; even distribution of DIAMOND journals; only 12
                    below 3.

Matching            Median: 4.826; only 7 below 4.

Carefulness         Median: 3.69; even distribution of DIAMOND journals; only 10
                    below 3.

Satisfaction        Median: 3.524; 36 score 4 or better, 23 worse than 3.


The median values are the medians of the mean values for the individual journals. 


the submitted manuscripts without ever having consulted a single referee as to 
rejection or acceptance of a paper. Indeed we registered n=91, 89, 88 
responses to Questions 4, 5, 6, but n=106 responses to Question 7 which 
means that several subjects did not receive a referee report at all. These values 
resemble those obtained for Economics Letters, for which we registered n= 
129, 125, 123 responses to Questions 4, 5, 6, but n = 154 responses to Question 
7. Given a mean response time of 14.77 weeks and a meager mean of 0.86 
referee reports, this leads us to conjecture that the decision to reject a paper 
without having sent it to a referee takes the editor of Economics Letters 
considerable time. Moreover, we cannot exclude that some authors counted a
mere letter from the editor as a true referee report.

As compared to other disciplines, e.g., the natural sciences, economics 
journals seem to take a particularly long time to reach a decision. Hardly any
journals decide in fewer than 10 weeks, and more than half of them need 20
weeks or more. Of the 26 remaining journals of the Diamond list, 20 need
more than 20 weeks to make a decision.

A mean of more than two referee reports is the exception rather than the 
rule. Among the Diamond journals, only Econometrica and Economic Inquiry 
reach a mean number of referee reports above two. These are the only Dia- 
mond journals which rank among the first twenty ranks with respect to the 
mean number of referee reports. Economics Letters ranks last among the 
journals from the Diamond list (0.86 referee reports per respondent). Recall 
that respondents might have considered the managing editors’ rejection as a 
valid referee report. 

We used subjects’ reports on total numbers accepted by and submitted to a 
respective journal to compute the journal’s individual acceptance rates. The 
data show that the more reputed journals have lower acceptance rates, which 
was to be expected. Indeed, 17 journals out of the Diamond list figure among 
the last 25 ranks.¹⁰ Cross-disciplinary comparisons show that manuscript
rejection rates are much higher in the humanities than in the natural sciences.¹¹
By and large, a rejection rate of more than 80% in the humanities
contrasts with an acceptance rate of some 80% in the natural sciences. 

Competence of the referee reports is, on the whole, judged rather favorably. 
Only twelve out of 106 journals were rated below 3.0 (out of a maximum 6.0), 
among them only two Diamond journals, viz. the Journal of Political Economy 
and the Journal of Financial Economics. Seven of the journals of the
Diamond list (recall that the Brookings Papers dropped out) score 4.0 or
better. Competence of referee reports does not seem to be positively correlated
with the reputation of a journal. Neither the journals of the Diamond list nor 
of the invited journals bunch at the upper or at the lower end; they appear to 
be rather evenly distributed among the ranks. For instance, 14 journals of the 
Diamond list rank ahead of, and 12 rank behind the mean rank of 44. 

Our results show that most journals score rather well with respect to 
matching of the managing editors’ decisions with the recommendations of the 
referee reports. However, this signals a good performance of peer review if 
and only if referees’ judgements are valid. If they are just reliable,¹² and the
managing editor decides blindly in accordance with them, this need not be a 
proxy for good refereeing because referee hostility or incompetence may be 
but insufficiently monitored by the editor. Attentive editors should interfere 
in the latter case, which would be reflected in lower matching scores. 
Accompanying letters to the editors may also be harsher than the referee 


10 However, there are some exceptions to this regularity. For instance, four journals of 
the Diamond list rank below 50 (out of 94 ranks). 

11 Cf., e.g., Zuckerman and Merton (1971), Lazarus (1982), Adair (1982), Hargens
(1988; 1990).

12 For a discussion of the concepts of validity and reliability see EEA.

13 With respect to the editorial decision to accept or reject a manuscript, voices have
been raised that encourage editors to use their discretionary powers wisely and, if
necessary, not to shy away from overriding referees’ recommendations (Bailar 1991, 138).
Stricker (1991, 164) disputes that good editors should behave like psychometric clerks 
who simply add up the scores that a manuscript gets from the referees. He argues that 
“good editors are not clerks. They read the manuscript, appraise the reasons reviewers 
give for their recommendations, and weigh all the information about it ...” He is paral- 
leled in this view by Glenn (1982, 212), Rodman (1970, 355-356), and Goodstein (1982, 
213). Crandall (1991) pleads along the same lines that editors should be super referees. 
He deplores that too many editors do not behave in this way. He suspects “that many 
editors do not even read the papers for which they are supposed to have editorial 
responsibility.” Scarr (1982, 54), editor of Developmental Psychology and the American 
Psychologist, has made a case for editorial responsibility. She refers editors who shirk 
their duties to one of Harry Truman’s wise insights: “If you can’t stand the heat, get out 
of the kitchen.” Yalow (1982, 244) blamed reviewer and editorial incompetence for in- 
stances as revealed by the Peters and Ceci (1982a, 1982b) experiment. Simon et al. (1986, 
270) report that only between 13 and 19% of authors’ complaints against referee reports 
were successful. 
reports for the authors. Editorial deviation from referee recommendations
may also be prompted by high backlogs of manuscripts which may goad 
editors’ zeal to curb the growth of the queues of papers agreed to be pub- 
lished. Such independent decisions by editors may help explain the lower 
rankings of some prestigious journals, such as the American Economic Review, 
the Journal of Political Economy, the Review of Economics and Statistics, the 
Journal of Public Economics, the Economic Journal, and the Quarterly Jour- 
nal of Economics. When an editor, because of space limits, is forced to reject 
manuscripts furnished with good referee reports, (s)he may well give in to 
favoritism. 

Concerning carefulness of the referee reports, journals’ scores were similar
to their scores on competence of the referee reports. Only 10 out of 106
journals scored less than 3, among them again the two notorious Diamond
journals, the Journal of Political Economy and the Journal of Financial
Economics. Only 6 out of the 26 journals of the Diamond list (after dropping
the Brookings Papers) scored 4 or better. As compared to competence,
we observe a minor shift of the reputed journals to lower ranks: 11
journals of the Diamond list were rated above and 15 below the mean rank.
Some reputed journals rank among the bottom 20 for carefulness of referee
reports, viz. the European Economic Review, the Oxford Economic Papers,
Economics Letters, the Quarterly Journal of Economics, Economica, the
Journal of Political Economy, and the Journal of Financial Economics.

Overall satisfaction with the whole procedure of paper submission proved
equally disappointing for the prestigious journals. Out of 106 journals, 36 scored at
4 or better; among them only five journals were from the Diamond list. Out of 
the 106 journals, 23 scored worse than 3; among them eight journals were from 
the Diamond list, to wit, the European Economic Review, Economica, the 
Quarterly Journal of Economics, the Journal of Labor Economics, the Eco- 
nomic Journal, the Journal of Monetary Economics, the Journal of Financial 
Economics, and the Journal of Political Economy. 


6.2 Statistical Results 


6.2.1 Response Biases 

All data obtained from the subjects are, of course, subjective data. However, 
the data collected for Questions 1-3 have “objective” counterparts. In an 
attempt to correct for subjective biases, we sent a mail to the editors of all 110 
journals for which we had received at least five valid responses to Question 1 
and asked them for editorial data on the average response time to authors, the 
average number of referee reports solicited, and the average acceptance rate 
of manuscripts. Replies to these questions seemed to be easy, as editors of 
most journals are wont to keep regular statistics on these figures. Indeed, a 
few journals even publish them (see, for instance, The Economic Journal
Managing Editors’ Report 2005). Editors who did not respond to our mail 
were sent a reminder. We received responses from the editors of 52 journals 
(response rate: 47.3%), among them 7 responses from the 26 Diamond jour- 
nals (response rate: 26.9%). Note that we could not check whether we 
received the true “objective” data, biased data or mere conjectures of the 
editors, but they represent an independent alternative data source which 
allowed inferences on possible biases. For the sake of a shorthand expression 
we address them as the objective data in this paper. 

Table 3 gives a concise summary of our results. Its entries are the means
(taken over all journals for which we had data) of the ratios of the mean 
responses of the subjects and the responses of the editorial board. Table 3 
shows us that the subjective response time exceeds the objective one by some 
50%, that the subjective number of referee reports is slightly lower than the 
objective number of referee reports, and that the subjective acceptance rate 
exceeds the objective one by some 150%. 


Table 3 Response Biases: Statistics


Subjective value divided by objective value    N    Min.     Max.   Mean   STD


Response Time                                  52   0.61*    4.73   1.49   0.69
No. of Reports                                 52   0.53     1.40   0.90   0.16
Accept. Rate                                   50   0.42**   8.89   2.50   1.34


All means significant at the 1% level (two-sided).
* Only 8 values smaller than 1.
** Only 4 values smaller than 1.
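
The construction of the Table 3 entries can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the function name `bias_ratio_summary` and all input figures are hypothetical, and testing the mean ratio against 1 with a one-sample t statistic is an assumed reading of the significance note under the table. Only the definition of the ratio (mean subjective response per journal divided by the editor's figure, then summarized across journals) is taken from the text.

```python
import math
import statistics


def bias_ratio_summary(subjective_means, objective_values):
    """Summarize subjective/objective ratios across journals.

    subjective_means: mean survey response per journal (e.g. weeks).
    objective_values: the figure reported by each journal's editor.
    """
    ratios = [s / o for s, o in zip(subjective_means, objective_values)]
    n = len(ratios)
    mean = statistics.mean(ratios)
    std = statistics.stdev(ratios)  # sample standard deviation
    # Assumed test: t statistic for H0 "mean ratio equals 1" (no bias).
    t_stat = (mean - 1.0) / (std / math.sqrt(n))
    return {
        "n": n,
        "min": min(ratios),
        "max": max(ratios),
        "mean": mean,
        "std": std,
        "t": t_stat,
    }


# Hypothetical response times in weeks for four journals (not survey data).
subjective = [24.0, 30.0, 18.0, 12.0]
objective = [16.0, 20.0, 12.0, 10.0]
summary = bias_ratio_summary(subjective, objective)
```

A mean ratio above 1, as in this made-up example, corresponds to an upward bias of the subjective figures; a ratio below 1 corresponds to a downward bias.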


The most spectacular upward bias is noticed for the subjective acceptance 
rates. We may offer several explanations for that (all of which may have 
contributed to produce this result): 

1. Self-selection effect: It seems that the more successful scholars felt more
attracted by our survey.¹⁴

2. Survey-selection effect: As our survey was directed to investigate authors’ 
experience with referee processes, we had asked subjects to respond only for 
those journals with which they had experienced at least one referee process. 
This rules out manuscript submissions which were rejected immediately by the 


14 Similar effects were observed by Sweitzer and Cullen (1994). They polled 209 
authors for their satisfaction with peer review processes. 67% of the AR (accept with 
revision) authors, 43% of the RR (reject but may resubmit) authors, and only 30% of the 
RO (reject outright) authors responded to their questionnaires sent to unsolicited 
authors of the Journal of Clinical Anesthesia. Higher response rates of authors whose 
papers were accepted were also observed by Garfunkel et al. (1990) for the Journal of 
Pediatrics. 
editors without soliciting referee reports. Of course, manuscripts which
entered the referee process have a positive chance of being accepted,
whereas the crude (objective) rejection rate also includes purely editorial
rejections.

3. Cognitive-dissonance effect: Successful events are memorized, failures 
are mentally suppressed. 

4. Trend effect: True acceptance rates fell with the lapse of time. 
Respondents who remember their submission history of papers amalgamate 
past with present experience, which, due to the influence of higher past 
acceptance rates, biased their perception of acceptance rates upwards. 
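
The arithmetic behind the survey-selection effect (item 2 above) can be made explicit with a small sketch. It assumes that papers rejected by the editor without referee reports are never accepted; the function name `refereed_acceptance_rate` and the figures are hypothetical and serve only to show the direction and possible size of the effect.

```python
def refereed_acceptance_rate(overall_rate, desk_rejection_share):
    """Acceptance rate among refereed papers.

    overall_rate: accepted papers / all submissions (the editors' figure).
    desk_rejection_share: share of submissions rejected without referees.
    Assumes desk-rejected papers are never accepted.
    """
    if not 0.0 <= desk_rejection_share < 1.0:
        raise ValueError("desk_rejection_share must lie in [0, 1)")
    if not 0.0 <= overall_rate <= 1.0 - desk_rejection_share:
        raise ValueError("overall_rate is inconsistent with the desk-rejection share")
    return overall_rate / (1.0 - desk_rejection_share)


# Hypothetical journal: 10% of all submissions accepted, half desk-rejected.
conditional_rate = refereed_acceptance_rate(0.10, 0.50)
bias_factor = conditional_rate / 0.10
```

In this made-up case, respondents who report only on refereed submissions would see an acceptance rate twice the editors' figure even if they remembered every submission correctly.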


The upward bias of the response time is most probably associated with the
upward bias of the acceptance rates. A well-established result says that
journals take less time to reject a paper than to offer a revision of the paper.¹⁵
This is reinforced when many papers are rejected immediately by the editors
without having entered the referee process. Given that the more
successful authors were over-represented in our survey, this implies longer
spells of response time.

Note, therefore, that our data are biased in favor of the more successful 
authors. However, as we did not pre-select our respondents a priori from the 
set of the successful ones, this upward bias is certainly less than it would have 
been, had we addressed only people whose papers were actually accepted for 
publication. On the other hand, a survey such as ours runs the risk of
attracting frustrated respondents who wish to deal a blow to those journals
which they consider to have treated them unfairly. The entries in Table 3 show 
that this was certainly not the case. 


6.2.2 Favoritism 

Favoritism can manifest itself in three ways, to wit, personal, institutional, and 
regional favoritism. Personal favoritism means that certain persons enjoy 
preferential treatment with respect to refereeing and/or editorial decisions. 
The literature abounds with gossip about personal favoritism, yet, in order to 
demonstrate its presence, one needs inside data on referee processes and 
editorial decisions, to which we had no access. Hence we could not study 
personal favoritism. Likewise, we could not study institutional favoritism 
because, for reasons of respondents’ anonymity, we have only a regional, not 
an institutional breakdown of data. Yet our investigation of regional favoritism
does not permit meaningful conclusions. Indeed, favoritism seems to manifest


15 Cf., e.g., Ellison (2002, 955, Table 2); Omerod (2002) remonstrated against the long
response time of decision processes of the Economic Journal: In the year 2000, it took 
this journal 18 weeks to reject a paper and 28 weeks to offer a revision of a paper. See 
also The Economic Journal Managing Editors’ Report (2005, 6, Table 4). 
itself, first and foremost, as institutional favoritism, followed by personal
favoritism. The Coupé data (2000a; 2000b) convey some flavor of institutional 
favoritism; alas, these data rest on published manuscripts rather than on 
submitted manuscripts, as asked for by Hodgson and Rothman (1999). For 
more information see EEA. 


6.2.3 Correlation Analyses 

To identify relationships among the responses to our questions, we pooled the 
data for all journals (irrespective of the number of responses per journal), and 
combined them into a correlation matrix, Table 4. 


Table 4 Correlation Matrix of Responses to Questions


               Response   Number       Accept.      Compe-    Editorial   Care-
               Time                    Rate         tence     Match       fulness

Number         -.021
               4333
Accept. Rate   -.029      .147**
               4049       4049
Competence     -.112**    [illegible]  [illegible]
               3974       3974         3974
Editorial      -.056**    .032*        [illegible]  .269**
Match          3858       3858         3858         3858
Carefulness    -.099**    .276**       .314**       .722**    .281**
               3817       3817         3817         3817      3817
Satisfaction   -.256**    .199**       [illegible]  .682**    .312**      .684**
               3791       3791         3791         3791      3791        3791


** Significance of correlation at the 1% level (two-sided). 
* Significance of correlation at the 5% level (two-sided). 
The lower lines in the cells denote the number of cases. 
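
Because the number of cases differs from cell to cell, each coefficient in Table 4 rests on the respondents who answered both questions involved. A minimal sketch of such a pairwise computation follows; it assumes Pearson coefficients and pairwise deletion of missing answers, neither of which the text states explicitly, and the function name `pairwise_pearson` and the sample answers are hypothetical.

```python
import math


def pairwise_pearson(x, y):
    """Pearson correlation over cases where both answers are present.

    Missing answers are encoded as None. Returns (r, number_of_cases).
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete cases")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy), n


# Hypothetical answers on a 0-6 scale; None marks a skipped question.
competence = [5, 4, None, 2, 1, 3]
carefulness = [5, 3, 4, 2, None, 3]
r, cases = pairwise_pearson(competence, carefulness)
```

Here two of the six hypothetical respondents drop out, so the coefficient is based on four cases, in the same way that the case counts in the table shrink for questions that fewer subjects answered.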


When considering response time, we find that it is negatively correlated 
with all responses. Although longer response times may also be caused by 
more and better referee reports, the negative correlation with all responses 
suggests that longer response times are associated more with editorial
inefficiency than with more or better referee reports.

When considering the number of referee reports, we observe a moderate, 
but positive correlation with the acceptance rate and with the qualitative 
responses. The positive correlation of the number of referee reports with the 
acceptance rates seems to be influenced by the occurrence of manuscript 
rejection without referee reports. These manuscripts have no chance of being 
accepted. Thus, whenever referee reports are solicited, the chance of 
acceptance becomes greater than nought. The small correlation between the
number of referee reports and the editorial match shows that referee reliability
becomes a problem in case of multiple referee reports. As editors of
economics journals are wont to reject a paper whenever a single one among
several referee reports is somewhat critical, irrespective of how positive the
other reviews are,¹⁶ authors perceive an editorial mismatch with referee
suggestions. This perception seems to have caused the small correlation.
Concerning the rest, more referee reports are associated with the perception
of higher competence and higher carefulness, and, in that way, with higher
overall satisfaction.

Prima facie one might have expected that the manuscript acceptance rate 
exhibits the paramount correlation with overall satisfaction. However, while 
that correlation is substantial, it is much lower than the correlation between 
competence and overall satisfaction, and between carefulness and overall 
satisfaction. The perceptions of higher carefulness and higher competence of 
the referee reports are associated with higher acceptance rates. Concerning 
the correlation of editorial match with referees’ recommendations and sat- 
isfaction, one would, however, have expected a higher correlation. 

The highest correlation reported in Table 4 is the one between competence 
and carefulness of the referee reports. Obviously, our respondents hold that a 
referee who does competent work also does it carefully and vice versa.¹⁷ Both
qualitative responses have at the same time the paramount positive correla- 
tions with overall satisfaction with the referee process. Thus, competence and 
carefulness emerge as the most important positive features of referee proc- 
esses in authors’ perceptions. They are even more meaningful for overall 
satisfaction than the acceptance rate itself.¹⁸ This gives rise to the conjecture
that authors accept rejection of their manuscripts more easily when it is based 
on competent and careful referee reports. And, conversely, they seem to be 


16 Cf., e.g., Zuckerman and Merton (1971, 78), Bakanic et al. (1990, 378), Hargens and 
Herting (1990, 97), Kupfersmid and Wonderly (1994, 56). In a similar sense cf. also 
Ingelfinger (1974, 687), Crandall (1982; 1991), Cole (1991), Coleman (1991, 142), and 
Eckberg (1982). 

17 In our instructions for response to Question 6 we used the following remark to alert
respondents that competence and carefulness need not coincide: “Concerning question 4,
please note that competence and carefulness may be independent.” The full set of
questions inclusive of instructions can be downloaded from our homepage
http://www.wiso.uni-kiel.de/vwlinstitute/ifs/chair/peerreview.php.

18 This result accords with the results of Garfunkel et al. (1990), who did not find
major differences in review evaluation among authors whose papers were accepted or 
rejected by the Journal of Pediatrics. However, our result for economics authors stands in 
remarkable contrast to the findings of Weber et al. (2002), who observed for authors of 
the Annals of Emergency Medicine that author satisfaction is associated with acceptance 
but not with review quality. 
Figure 1 Distribution of Satisfaction for Factor Levels

[Figure not reproduced. Quality axis ticks (the 13 representative factor levels): -2.16, -1.86, -1.55, -1.25, -0.95, -0.64, -0.34, -0.04, 0.27, 0.57, 0.87, 1.18, 1.48.]


but moderately happy with the acceptance of their paper when it was based on 
incompetent and sloppy referee reports. 

Finally, to make use of similarities among responses, we applied a factor 
analysis. We employed a principal-component analysis using a varimax rotation
with Kaiser Normalization.¹⁹ This produced a factor composed of the two
components: carefulness and competence, each with a factor weight of 0.539,
which means that their marginal rate of substitution is equal to −1. We call
this factor quality. It explains 86.114% of the variance among the two
characteristics competence and carefulness. Competence and carefulness are,
therefore, good proxies for the factor “quality”. Our analysis yielded 48 factor
levels, which we found to be arranged in terms of 13 distinct groups.
Representing each group by its median allowed us to focus on 13 representative
factor levels. A negative (positive) factor value means that a subject gives a
worse (better)-than-average evaluation of the respective journal. A factor value
of zero corresponds to the average evaluation.
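
The equal weights and the explained variance reported above are what one would expect for the first principal component of two standardized variables: with correlation r, its eigenvalue is 1 + r, it explains (1 + r)/2 of the total variance, and both variables enter with the same weight. The sketch below works this out. It is an illustration, not the authors' computation: the function name `two_variable_pca` is hypothetical, the correlation is backed out from the reported 86.114% rather than read from the data, and interpreting the "factor weight" as the component score coefficient 1/sqrt(2(1 + r)) is an assumption.

```python
import math


def two_variable_pca(r):
    """First principal component of two standardized variables.

    r: correlation between the two variables, assumed to lie in (0, 1).
    """
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    eigenvalue = 1.0 + r                     # variance of the first component
    share = eigenvalue / 2.0                 # share of total variance explained
    loading = math.sqrt(eigenvalue / 2.0)    # correlation of each variable with the component
    score_weight = loading / eigenvalue      # weight of each variable in the factor score
    return {
        "eigenvalue": eigenvalue,
        "share": share,
        "loading": loading,
        "score_weight": score_weight,
    }


# Correlation implied by the reported share of explained variance (assumption).
implied_r = 2.0 * 0.86114 - 1.0
result = two_variable_pca(implied_r)
```

Under these assumptions the implied correlation is about 0.722 and the score weight rounds to 0.539 for each variable. Because both weights are equal, one point more on competence offsets one point less on carefulness, which is the sense of the marginal rate of substitution of −1.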

Associating these 13 factor levels with the seven levels of overall satisfaction
shows a characteristic distributional pattern: For low quality we
observe a positively skewed distribution of satisfaction. As quality increases,
the distribution of satisfaction becomes symmetrical, and gradually becomes
negatively skewed as quality approaches its peak. Note that, although this
pattern is in a way due to bunching effects inherent in categorical measurement,
it is, nevertheless, rather distinctive in this case. Figure 1 shows the
respective graph, which arranges normalized quality at the abscissa and
satisfaction at the ordinate. The vertical axis indicates the absolute frequency of
our 3817 data points.


19 For ease of calculation we shifted the Likert scales of questions 4-7 by 1 to
Likert scales from 1 to 7. For the presentation in the figures, we stick to the scale range
from 0 to 6.

Low levels of satisfaction are caused by a positively skewed distribution of 
factor values having their peak at the lowest factor level for the lowest level of 
satisfaction. For higher satisfaction levels the distribution of factor levels 
converges first to a symmetric distribution which is reached at the medium 
satisfaction level. For still higher satisfaction levels the distribution of factor 
levels assumes the shape of negatively skewed distributions. The factor dis- 
tribution for the highest satisfaction level has its peak at the highest factor 
level. 
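
The skewness pattern described here can be checked numerically from frequency counts per satisfaction level. The following sketch uses the moment coefficient of skewness; the function name `skewness_from_counts` and the three frequency vectors are hypothetical and merely mimic the shapes described in the text (they are not the survey's counts).

```python
def skewness_from_counts(counts):
    """Moment coefficient of skewness for levels 0, 1, ..., len(counts) - 1.

    counts[k] is the absolute frequency of level k.
    """
    n = sum(counts)
    if n == 0:
        raise ValueError("no observations")
    mean = sum(level * c for level, c in enumerate(counts)) / n
    m2 = sum(c * (level - mean) ** 2 for level, c in enumerate(counts)) / n
    m3 = sum(c * (level - mean) ** 3 for level, c in enumerate(counts)) / n
    if m2 == 0:
        raise ValueError("skewness undefined when all answers coincide")
    return m3 / m2 ** 1.5


# Hypothetical frequencies of satisfaction levels 0..6 for three quality groups.
low_quality = [40, 25, 15, 10, 5, 3, 2]      # mass at low satisfaction
medium_quality = [2, 10, 20, 36, 20, 10, 2]  # symmetric around level 3
high_quality = [2, 3, 5, 10, 15, 25, 40]     # mass at high satisfaction

skew_low = skewness_from_counts(low_quality)
skew_medium = skewness_from_counts(medium_quality)
skew_high = skewness_from_counts(high_quality)
```

A positive value indicates a long tail toward high satisfaction (positively skewed), zero indicates symmetry, and a negative value indicates a long tail toward low satisfaction.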

Figure 1 depicts a mountain extending across the figure from the (−2.16, 0)
coordinate point to the (1.48, 6) coordinate point. The steepness of this 
mountain on both sides of its ridge is captured by the correlation coefficient 
between the quality factor and overall satisfaction. Its value is 0.735. It is 
significant (two-sided) at the 1% level. This illustrates a good explanation of 
overall satisfaction with the referee process by competence and carefulness of 
the referee reports. 

Finally, we have a look at the joint distribution of overall satisfaction and 
subjective acceptance rates. Figure 2 shows the respective graph. We observe 
negatively skewed distributions for all intervals of acceptance rates except the 
lowest acceptance rates (virtually rejections). Subjects whose papers are often 


Figure 2 Distribution of Satisfaction with Referee Reports in Terms of 
Acceptance Rates 
Figure 3 Distribution of Satisfaction with Referee Reports in Terms of
Acceptance Rates for Factor Levels 
rejected are not distinctly dissatisfied. Although they are not extremely
enthusiastic about rejection, we encounter in this pattern a reflection of the 
appreciation of careful and competent referee reports. Good-quality referee 
reports may, thus, indeed cause authors to understand rejection of their 
manuscripts. 

Figure 3 repeats this exercise for the 13 representative factor levels. For 
higher acceptance rates we observe negatively skewed distributions, while for 
the lowest acceptance rates satisfaction is rather evenly distributed with the 
exception of extreme happiness. 


7 Conclusions


Peer review in science is a tribunal of sorts. It influences decisively personal 
advancement, research opportunities, salaries, grant-funding, promotion, and 
tenure. Peer review claims to exert quality control of manuscripts, to improve 
manuscripts, to promote innovative research, to foster dissemination of new 
research, to select projects for grant funding, to screen papers for conference 
presentation, and to serve as a means to rank researchers, journals, and 
institutions. 
Yet journals no longer serve the function of disseminating new research.
About three and a half decades ago, Garvey and Griffith (1971) had already
demonstrated that the bulk of communication and dissemination of current
research runs through informal outlets such as personal communications, technical
reports, discussion papers, and preprints. Eventual publication of a paper
means that it has entered the archives of science, while its author has long since
started new research. This applies even more so in the electronic age. The
main purpose of journal publication nowadays is to imprint a signal of quality 
on a scholar’s research. However, this requires an excellent performance of 
peer review. When peer review lacks validity, impartiality, and fairness, the 
imprint of manuscript excellence becomes dubious. 

These limitations induced us to conduct an internet questionnaire investigation
of authors of economics journals. We found much longer response
times than what is customary in the natural sciences. The top journals had on
average high rejection rates. While the top journals did not show particular
differences from other journals with respect to the distribution of competence
and carefulness of referee reports, they performed somewhat worse for
overall satisfaction. Moreover, it is always the same group of some eight top
economics journals which populate the bottom rungs in the respective rankings.

We observed response biases among our respondents: the subjective 
response time exceeds the objective one by some 50% and the subjective 
acceptance rate exceeds the objective one by some 150%. This may be 
explained by several effects (self-selection, survey-selection, cognitive dis- 
sonance, and trend). 

A correlation analysis showed that competence and carefulness are highly 
correlated, and showed the paramount correlation with overall satisfaction, 
while the acceptance rate exhibited a smaller correlation with overall satisfaction.
This suggests that the authors of economics journals value good
referee reports more highly than acceptance as such. In other words, they
will understand a rejection of their paper
if it is backed by well-founded reports.

Combining competence and carefulness into a factor “quality” showed that, 
as quality increases, the distribution of satisfaction follows, first, a positively 
skewed distribution, becomes, for higher levels of quality, a symmetrical dis- 
tribution, and approaches a negatively skewed distribution for the highest 
level of quality. When juxtaposing overall satisfaction (quality) and accept- 
ance rates, we found negatively skewed distributions for all acceptance rates 
with the exception of the very lowest acceptance rates, for which the dis- 
tribution is largely uniform. This confirms that manuscript rejection is tol- 
erated provided that the referee reports are competent and carefully done. 
What Should We Expect from Peer Review?


Comment by 


MAX ALBERT and JÜRGEN MECKL 


Seidl, Schmidt and Grösche (henceforth, SSG) report on an internet ques- 
tionnaire study that asked economists about their experiences and satisfaction 
with the referee process in economics journals. In this comment, we put the 
paper in the context of scientific competition and ask whether we should be 
worried by its results. 


I Scientific competition and scientific quality standards 


Scientific competition is mainly driven by the quest for status or reputation. 
Researchers earn status when their contributions are used — and not just cited 
— by other researchers (see Hull 1988, 283). Status-seeking researchers should 
use the products of previous research (contained mainly in research papers) if 
they believe that it will help them to produce output that will, in turn, be used 
as an input in future research. 

In a nutshell, then, research is the production of papers by means of papers. 
In order to use the results of a paper, researchers must, of course, be aware of 
the paper and believe it to be relevant to the problems they are working on. 
Even if these conditions are satisfied, however, they will not use a paper if they 
consider it (i.e., the results and ideas contained in it) to be of too low a quality. 

Scarcity of attention and quality concerns explain the peer review system. 
Journals try to collect high-quality papers that have something in common, 
either a specific topic or, in the case of general journals, a potential to interest 
even off-topic researchers. Some journals are more successful than others in 
publishing high-quality work, get more attention, and, in turn, attract more 
submissions of high-quality work. This positive-feedback effect leads to 
quality rankings among journals. 

Ultimately, publishers compete, with the help of their journals, for the 
attention of the scientific community. There is a hierarchy of delegation, 
where all agents pursue their own interests. Publishers select and control 
editors, who, in turn, select and control referees. At each stage, there is
competition and moral hazard. Moreover, some deviations from the appli- 
cation of quality standards, like promoting papers sympathetic to an editor’s 
or referee’s research, can be viewed as payment in kind for editorial or ref- 
ereeing services. For this reason and others, we should expect some amount of 
personal, institutional, or regional favoritism, as well as inner-scientific par- 
tisanship, in the selection of papers for publication. 
The explanations so far assume given quality standards used by researchers
in selecting input for their own research. It is, however, not at all clear how 
decision-relevant quality standards can become established in the production 
process we have described above. Even if everybody expects everybody else 
to use only inputs that satisfy certain standards of high quality, this expect- 
ation is not self-fulfilling — unless using high-quality inputs increases one’s 
chances to produce high-quality output. However, the last assumption is quite 
reasonable for the quality criteria used in science.¹

Quality standards in science, then, are the outcome of an intertemporal 
coordination problem among self-interested and forward-looking researchers. 
As explained above, we expect these quality standards to spill over, if 
imperfectly, into the peer review process. With perfect coordination, there 
exists a single quality standard. However, there are several factors working 
against perfect coordination. First, new methodological arguments can shift 
the focal point of the coordination game. Second, at a given time, there may 
exist several candidates for a focal point. Third, researchers have different 
information about current debates and different reaction speeds. Thus, the 
coordination process is slow and subject to shocks, working — despite the 
forward-looking attitude of researchers — like an evolutionary process of 
short-sighted adaptation to one’s perception of the current trends, with several
standards competing during adjustment. Fourth, quality standards in different 
research areas (which are defined by relatively low probabilities of use across 
boundaries) differ, which leads to grey areas where standards are uncertain or 
disputed. Fifth, quality in science has many dimensions, and weighing these 
dimensions may be a problem even if there is broad agreement about the 
dimensions themselves. 

Thus, scientific competition involves competition between quality stand- 
ards. There exists pressure in the direction of harmonizing the standards, but 
one should not expect perfect harmony, especially with respect to the fine 
points and when new ideas threaten existing standards. Moreover, quality may 
be difficult to detect. Even on the basis of common standards, different editors 
or referees may still come to different conclusions because they prefer dif- 
ferent trade-offs between errors of the first and second kind. 


1 See Albert (2006) for a model explaining quality standards in scientific competition
along these lines.


2 Quality standards for peer review


SSG suggest that journals only archive papers and hand out quality signals. In
their conclusions, they write that peer review in science is a tribunal with
decisive influence on individual careers. They then list the claims made in
favor of peer review, which, as far as the journal referee process is concerned,
are: quality control, improvement of manuscripts, promotion of innovative 
research, fostering dissemination of new research, and serving as a means to 
rank researchers. They note that, due to the publication lag, journals no longer 
disseminate new research; instead, their main purpose is “to imprint a signal 
of quality on a scholar’s research”. This, in their view, requires excellent 
performance of peer review, especially validity, impartiality, and fairness. 

Reliability of the referee process means that different referees come to the 
same conclusion. Validity of the referee process means that quality judgments 
report the true quality of the paper. For instance, stories about highly suc- 
cessful papers that were frequently rejected before their eventual publication 
are often viewed as anecdotal evidence of low validity. Typically, the degree of 
reliability and validity is measured in terms of correlations between different 
referee conclusions or between quality judgments and quality. 

Validity is ill-defined, however, and reliability is not to be expected when 
several quality standards compete. Only with (almost) perfect coordination on
quality standards must low validity and reliability be due to imperfections in
the peer review process. We do not believe that perfect coordination has been 
reached in economics. 

Impartiality and fairness mean absence of favoritism and, instead, reliance 
on quality standards, which is of course possible even with competing stand- 
ards. SSG report regional favoritism. However, consider the case of the 
Quarterly Journal of Economics. This journal rejects most papers without 
referee reports but (not mentioned by SSG) publishes many previous NBER 
working papers. Since NBER papers are already subject to quality control, 
selecting from them might lower the cost of refereeing without lowering 
quality. Hence, it may be a sign of efficiency if some journals tap such pools of 
high-quality papers. Due to the nature of the NBER, this leads to regional 
favoritism. 

However, there is no prima facie case that such practices lead to an unfair 
and partisan publication system. A combination of journals with different 
biases can lead to a fair system. Moreover, in judging the quality signals 
produced by journals, it is easy to adjust for known biases: if you publish a
paper in a journal biased against you, it just means that the quality of your
paper is probably higher than the journal’s average. 

Editors usually want the referee to point out possible improvements of the 
paper. Within limits, this is reasonable since the editor would like the paper
to be as good as possible and the referee can produce at least some relevant
hints as a by-product of quality control at almost no additional cost. However, 
referees should not invest much in improving a paper. If they did, this would 
create incentives for an author to abuse referees as unpaid ghostwriters or 
conscripted audiences. It is unlikely that this would result in an efficient team 
effort. Hence, it is perfectly alright if bad papers get sloppy and short reports. 
This gives authors an incentive to invest more in their papers (and to seek
coauthors themselves) and makes better use of the scarce time of the
referees, who can concentrate on good papers. It also implies, however, that 
editors and referees should not necessarily aim at the satisfaction of authors. 

Nevertheless, trying to measure imperfections is certainly an excellent idea. 
The question is whether the data of SSG actually point to imperfections, and 
hence whether we should be worried about the relatively poor performance of 
top journals. 

We both admit that, independently of each other, we started filling in SSG’s 
questionnaire but gave up since, due to lack of time or access, we could not 
consult our files. We both were resolved to get to it later, but as these things 
go, we never did. In line with our experience, SSG admit that those who 
persevered may have answered the questions from memory, which, as they 
recognize, may lead to systematic biases. They argue, however, that authors’ 
memory is probably what counts in submission decisions and with respect to 
author satisfaction. We agree. However, we cannot quite see why author 
satisfaction should be important, especially if, as SSG find, it depends strongly 
on the competence and care invested in the reports. 

SSG note that top journals receive lower-than-average ratings for their 
referee processes. Even if this indicated lower-than-average quality, this need 
not be problematic. Authors submitting a paper face costs in terms of sub- 
mission fees, rejection risks and decision times, which may be more or less 
mitigated by the quality of the reports. Top journals overwhelmed by sub- 
missions should offer worse terms to authors; they could do this by urging 
their referees not to waste time on any but the most excellent papers. 

However, we can think of two plausible explanations for lower-than-aver- 
age ratings for top journals’ referee processes even if these journals offer 
average quality. First, authors may just expect more from higher-ranked 
journals and judge referee processes not in comparison with each other but in 
comparison with their expectations. Second, authors can make two errors 
when submitting their papers: aiming too high or aiming too low. If they prefer 
the first error to the second, papers will on average be submitted too high, 
which, in the worst case, leads to a rejection based on a single sloppy and short 
negative report. Papers then trickle down to lower-ranked journals until paper 
quality matches journal rank. If referee reports get more careful as the gap 
between journal rank and paper quality shrinks, the trickle-down effect 
implies that author satisfaction with the referee process increases with falling 
journal ranks, even if journal policies are all the same. In this context, sig-
nalling a large gap between journal rank and paper quality by a sloppy report
offers a further advantage: authors may adjust their self-assessment more 
quickly, which reduces the number of wasted submissions in the trickle-down 
process. 


Methodology and the Constitution of Science:
A Game-theoretic Approach 


by 
Jesus P. ZAMORA BONILLA 


1 Science as a game 


Competition is an essential element of the scientific process as it is usually 
carried out. Nevertheless, its competitive aspects have been much more fre- 
quently studied from a sociological point of view than from a philosophical or 
epistemological one, and (perhaps with the main exception of Popperian 
falsificationism and Mertonian institutionalism), the effects of competition 
and rivalry on the cognitive value of scientific discoveries have tended to be 
given an anti-objectivist interpretation. Furthermore, although competition is 
a phenomenon clearly under the scope of game-theoretic analysis (or, in 
general, of micro-economic analysis), very few attempts have been made until 
now to describe scientific research formally as a kind of ‘game’. Taking all
this into account, the aim of this paper is to sketch some guidelines of a 
philosophical understanding of scientific research as a competitive, game-like 
process, a point of view which, in the end, will try to provide some analytical 
tools with which to assess the rationality and objectivity of scientific knowl- 
edge. 

An underlying idea of this approach will be the notion that scientific 
research can be described as a game which is played according to some rules. 
This idea can be traced back to Karl Popper’s Logik der Forschung, where the 
notion of scientific method is explicated as something more akin to ‘the logic
of chess’ than to the rules of formal logic.¹ In this sense, methodological rules
are conventions, insofar as they could conceivably be different from what they are
(actually, many of them are not equal in different scientific fields or schools, 
and also vary with time). Popper’s attempt was to justify his own preferred 
rules by somehow deriving them from the goal of maximising the ‘criticiz- 
ability’ of every scientific item, although he offered few convincing justifica- 
tions of why this epistemic value, criticizability, had to be taken as the most 
important one in science. I will not attempt to determine here what the values 
of scientific research ‘must’ or ‘should’ be: I rather think that most of the 
answer is better left to scientists themselves, as well as to people using
the outcomes of science or suffering from them; but I shall nevertheless 
explore this idea of scientific norms as conventions derived in some way from
the goals ‘we’ want science to fulfil (the question, of course, is who ‘we’ are).
From a game-theoretic perspective, two different but interrelated sets of
questions emerge once we interpret scientific norms as the rules of a game:
First, what will scientists’ behaviour be once certain norms have been estab-
lished? And second, what norms would they prefer to have if they were given
the choice? Obviously, a rational answer to the second question can only be
given after making some prediction about how people will react under some
norms, i.e., after having given some answer to the first question. The theory
about how people choose the norms under which they have to interact is
known as ‘constitutional political economy’ (cf. Brennan and Buchanan 1985),
and one particular goal of this paper is, then, to outline a ‘constitutional-
economic’ approach to methodological rules (see Jarvie 2001 for a non-
economic, but also ‘constitutional’, interpretation of Popper’s view of scientific
norms).


1 As will become clearer below, I suggest that the rules of science resemble even more
the rules of sports, like football or tennis, than of ‘logical’ games like chess.

The main elements in the description of a game are the options (or ‘strat- 
egies’) of each player (or agent), the rules (i.e., an indication of which outcome 
obtains for every feasible combination of strategies, one for each player), and 
the preferences of the agents (i.e., an indication of how each player evaluates 
each possible outcome). Once this description has been given, the analysis of a 
game typically proceeds by trying to determine its ‘solutions’ or equilibria. 
Technically, an equilibrium of a game is a combination of individual choices 
such that no player can make a decision better for her than the one she has 
made, given the decisions made by the other players (Nash equilibrium). In 
general, the goal of a game theoretic analysis of a social fact is to show how 
some relevant features of the situation can be explained as an equilibrium 
emerging from the interaction of the agents. As readers familiar with the 
developments of game theory will know, one typical problem is that many 
games have more than one possible equilibrium, and in this case, either the 
outcome of the game remains indeterminate, or some stronger conditions 
must be added to justify why some specific solution is attained. It is also 
possible that no equilibrium exists, but it can be proved that, under a wide set 
of circumstances, there always are some equilibria if agents have the option of 
choosing, not directly an option, but a determinate probability of choosing 
every possible option.² Further mathematical complications result from the
analysis of repeated or dynamic games (when players have to take a sequence 
of decisions, perhaps in a changing environment), of stochastic games (when 
the outcomes of the players’ decisions are not known with certainty), or of 
games of incomplete or asymmetrical information (when players do not know 
with certainty some possible states of nature, or some of them know more
than others). The application of these models will surely be extremely inter-
esting and even unavoidable for understanding many features of science, but
this paper will again offer only the simplest analysis.


2 This is traditionally called a ‘mixed strategy’, whereas a ‘mixed equilibrium’ is one
that obtains by a combination of mixed strategies; nevertheless, the analysis presented in
this paper will always stay at the level of ‘pure’, i.e., deterministic, equilibria.
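
As an illustration of the equilibrium notion described above, the following Python sketch enumerates the pure Nash equilibria of a small two-player game. It is only a sketch under stated assumptions: the function name `pure_nash_equilibria`, the two example games, and all payoff numbers are hypothetical and are not taken from the model presented in this paper.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player game.

    `payoffs` maps (row_strategy, column_strategy) to a pair
    (row_player_payoff, column_player_payoff).
    """
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r, c in product(rows, cols):
        row_payoff, col_payoff = payoffs[(r, c)]
        # A profile is an equilibrium if neither player can do strictly
        # better by deviating alone, given the other player's choice.
        row_ok = all(payoffs[(r2, c)][0] <= row_payoff for r2 in rows)
        col_ok = all(payoffs[(r, c2)][1] <= col_payoff for c2 in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


# Hypothetical coordination game: two researchers each adopt the 'old' or
# the 'new' methodological rule and are rewarded only if they coincide.
coordination = {
    ("old", "old"): (3, 3),
    ("old", "new"): (0, 0),
    ("new", "old"): (0, 0),
    ("new", "new"): (2, 2),
}

# Hypothetical 'matching pennies' game: the row player wins on a match,
# the column player wins on a mismatch.
matching_pennies = {
    ("heads", "heads"): (1, -1),
    ("heads", "tails"): (-1, 1),
    ("tails", "heads"): (-1, 1),
    ("tails", "tails"): (1, -1),
}

print(pure_nash_equilibria(coordination))
print(pure_nash_equilibria(matching_pennies))
```

The coordination game has two pure equilibria, so its outcome stays indeterminate unless stronger conditions are added; the second game has no pure equilibrium at all, which is the situation in which mixed strategies would be needed.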


2 Scorekeeping in the game of science 


I proceed now to a description of the basic elements in the game of scientific 
research, which will essentially be conceived as a game of persuasion. That 
language is extremely important to science can hardly be denied. Authors as 
different as the logical positivist Rudolf Carnap and the post-modern 
anthropologist Bruno Latour would agree at least on this point, though 
obviously for very different reasons. The perspective I am going to take here is 
closer to Latour’s in the sense that I will assume that interaction between 
researchers mostly takes place through a continuous mutual examination of 
what each other says or writes, although I would guess that scientists can agree 
to evaluate their colleagues’ ‘inscriptions’ (to use Latour’s word) by means of 
rules which a Carnapian would not dislike too much. This does not mean, 
however, that other things besides language are unimportant; scientists also 
perform non-verbal actions, they experiment, observe, earn and spend money, 
draw diagrams, organise meetings, and so on, though it is true as well that a big
part of these things is done by speaking, and, on the other hand, that what
people say (and especially what they write) is usually more public than what
they do, and so it is easier for other people to scrutinise. So, it can be
instructive to describe the game scientists play as if their main decisions
related to what assertions to make (probably before or after performing some
other actions), and as if their rewards essentially depended on what other
people are asserting. This vision of the process of scientific communication as
central to the strategies of researchers is not only consistent with a big part of 
the work on sociology of science of the last two or three decades, but is also 
close in spirit to some recent proposals in the philosophy of language. I am 
referring particularly to Robert Brandom’s inferentialism (Brandom 1994). 
According to Brandom, what makes a series of noises count as an assertion
is the chain of inferences the speech community takes as appropriate to make 
regarding that assertion, inferences which essentially relate to the normative 
status that each participant in a conversation attributes to the others (i.e., the 
things participants are allowed or committed to do by the rules of the language 
game). For example, my saying ‘there is a cat on my roof’ can be taken as an 
assertion by my hearers if and only if we share a set of normative inferential 
practices which allow them to attribute to me, under specified circumstances, 
the ‘obligation’ of presenting some relevant evidence from which that sen- 
tence can be derived, as well as that of accepting the linguistic or practical 
consequences which, together with other commitments I have made, follow 
from it. Using a metaphor suggested by Wilfrid Sellars, understanding an
expression would amount to mastering its role ‘in the game of giving and 
asking for reasons’. It is important to mention that Brandom’s concept of 
‘inference’ does not only cover moves from sentences to sentences, but also 
from ‘inputs’ of the language game (e.g. observations) to sentences, as well as 
moves from sentences to ‘outputs’ of the game (e.g., actions). 

The aspect of Brandom’s theory I want to emphasise is that linguistic 
practice proceeds by each speaker ‘keeping score’ of the commitments made 
by the others and of the actions commanded by those commitments, according 
to some inferential rules which define the language games which are possible 
within their speech community. It is this idea of ‘scorekeeping’ that will be put 
into use here in order to analyse the game of science. I propose to consider the 
‘inscriptions’ produced by a researcher as her set or ‘book’ of commitments 
(her ‘book’, for short). There is no need for every such commitment to amount
to the bare acceptance of a certain proposition (say, A), for it is possible to
make a variety of qualified (or ‘modalised’) commitments, such as ‘it seems likely
that A’, ‘there is some evidence that A’, ‘A deserves some attention’, and so 
on. The game theoretic nature of scientific research arises because each scien- 
tist’s payoff depends on what is ‘written’ not only on her own book, but on the 
book of each other member of her community. This payoff is generated by 
three interacting factors: an internal score, an external score, and a resource 
allocation mechanism, all of which are determined by several types of norms. 
In the first place, any scientific community will have adopted a set of meth- 
odological norms with which to assess the scientific value of any set of com- 
mitments; the coherence of a researcher’s book with these norms (or, more 
precisely, the coherence her colleagues say it has) will determine the internal 
score associated to that book. Second, and in contrast to the case of everyday 
language games, in science many propositions are attached to the name of a 
particular scientist, usually the first who advanced them; one fundamental 
reward a scientist may receive is associated with the fate that the theses (laws, 
models, experimental results...) proposed by her have in the books of her 
colleagues. This ‘fame’ is what I call here her external score. The combination 
of the internal and the external score associated to a book is its global score. 
Third, the community will work with a set of rules for the allocation of re- 
sources which will determine how much money, what facilities, what work 
conditions, what assistants, and so on, will be allotted to each scientist, 
depending on her global score. 
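
The three factors just described can be made concrete in a short Python sketch. Everything specific in it is an assumption made for illustration only: the text does not say how the internal and the external score are combined, nor how resources depend on the global score, so the simple sum in `global_score` and the proportional rule in `allocate` are hypothetical choices, as are the names `Book`, `external_score`, and the example data.

```python
from dataclasses import dataclass, field


@dataclass
class Book:
    owner: str
    # Maps each accepted proposition to the scientist credited with it.
    commitments: dict = field(default_factory=dict)
    # Coherence with the methodological norms, as assessed by colleagues.
    internal_score: float = 0.0


def external_score(scientist, books):
    """Count how often the scientist's propositions appear in colleagues' books."""
    return sum(
        1
        for book in books
        if book.owner != scientist
        for credited in book.commitments.values()
        if credited == scientist
    )


def global_score(book, books):
    # Assumption: the two scores are simply added.
    return book.internal_score + external_score(book.owner, books)


def allocate(budget, books):
    """Assumption: resources are shared in proportion to global scores."""
    scores = {b.owner: global_score(b, books) for b in books}
    total = sum(scores.values())
    if total == 0:
        return {name: budget / len(books) for name in scores}
    return {name: budget * s / total for name, s in scores.items()}


# Hypothetical community of three researchers.
books = [
    Book("A", {"p1": "A", "q1": "B"}, internal_score=2),
    Book("B", {"q1": "B", "p1": "A"}, internal_score=1),
    Book("C", {"p1": "A"}, internal_score=1),
]
print(allocate(70, books))
```

In this toy community, A's proposition is accepted by both colleagues, so A obtains the highest global score and the largest share of the budget, although the allocation rule itself never looks at who proposed what.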

So viewed, the game of scientific research proceeds as follows. The meth-
odological norms of a discipline tell each researcher what she can do
(or must do) in order to write a book with a high internal score; this will
make her count as a more or less ‘competent’ researcher. These norms indi- 
cate how to perform and report experiments, what formal methods to employ 
and how, what types of inductive inferences are appropriate, what styles of 
writing are acceptable, and so on. By following these norms, she will end up
committing herself to the acceptance of some propositions advanced by other 
colleagues, hence contributing to their having a high external score. She will 
also have to comment on the coherence of her colleagues’ commitments with 
the methodological norms of the discipline, contributing to raising or lowering
their internal score. On the other hand, in order to reach a high external score, 
she has to take advantage of her colleagues’ struggle for attaining a high 
internal score: she has to be able to devise experiments, hypotheses, or models 
which her colleagues, given their previous commitments, and given the 
accepted methodological norms, cannot refuse to accept without running the 
risk of high losses in their internal scores. 


3 An example: the first gravity wave experiments 


To exemplify the applicability of a game theoretic approach to the analysis of 
scientific research processes, I shall take H. M. Collins’ classic narration of the 
dispute about gravity waves which followed the experiments of the physicist 
Joseph Weber (Collins 1985). According to Collins, Weber’s results strongly 
conflicted with accepted cosmological theories (for his experiments indicated 
an amount of gravitational energy too big for our nearby universe to be stable), 
and nearly all attempts at replication failed to show the same results (although
none of them was indisputably negative taken in isolation). Under these
circumstances, the other members of the scientific community chose to reject 
Weber’s results, and decided that presumed ‘gravity wave detectors’ detected 
nothing at all because there was no signal strong enough to be detectable; this 
means that the community did not assign a high score to Weber, neither an 
external score (for his presumed results were not accepted), nor an internal one 
(because deficiencies in his methods were pointed out). In spite of this, Weber 
went on defending his experiments and trying to improve them. From a game 
theoretical point of view, the first relevant question is whether all those deci- 
sions (both Weber’s and those of his critics) were rational and mutually con- 
sistent, i.e., whether they constituted a Nash equilibrium. For example, could 
Weber have made a better decision at some point of the process? It is very 
likely that he could have forecast the negative reaction of the community;
furthermore, he might have acknowledged that he was wrong when non-con-
firmatory results began to appear in the experiments of some colleagues. That
he did not take this decision seems to indicate that he severely misrepresented 
the chances of his ‘discovery’ being recognised; then, though his decisions might 
have been ‘optimal’ given his own (over-) estimation of success, this estimation 
had to be very wrong, and so in a sense he was acting ‘irrationally’, at least from 
the cognitive point of view. If we did not want to accuse Weber of irrationality, 
we would have to look more deeply into his view of the situation. 

On the other hand, what about the decisions of his colleagues? Given that
most experiments were inconclusive, that the acceptance of Weber’s results 
might have forced a toilsome reformulation of much accepted knowledge, and 
that this reformulation would demand the cooperation of many theoreticians 
and experimenters, the decision of waiting till a ‘significant’ number of col- 
leagues had made a decision seems logical for a majority of researchers. 
Nevertheless, for those who had more to lose or to win if Weber was right
(for example, because their prestige strongly depended on the theories which 
negated the existence of detectable gravity waves, or because they expected to 
contribute with new discoveries in the line of Weber’s if he happened to win) 
it seemed rational to attempt to replicate the experiments soon, as many did. 
In conclusion, the resolution of the debate looks like a Nash equilibrium, since 
everybody chose her best option, given what the other people were doing, 
although perhaps Weber himself suffered from a strong confirmatory bias (i.e., 
his decision was rational according to the beliefs he actually had about the 
probability of successful replications, but these probability judgements were 
somehow defective). 

The situation, however, is not so simple once we consider more deeply the 
strategies available to each researcher. For example, imagine you are one of 
those who are waiting for more information from your colleagues before 
deciding what to do with Weber’s assertions. Your options are not just ‘accept’ 
and ‘reject’, but rather ‘accept if at least ten percent of the community accept; 
reject otherwise’, ‘accept if at least fifty percent accept; reject otherwise’, and 
so on, or even something more complicated, because you will surely take also 
into account your own degree of belief in the validity of the disputed 
hypothesis. The scientific community must be in an equilibrium also with 
respect to the decisions about when there are ‘enough’ reasons in favour of a 
proposition for it to become acceptable.³ On the other hand, what about
people trying to replicate Weber’s experiments? If it is true that the ‘com- 
munity leaders’ have so much prestige that their own conclusions would 
‘trigger’ a consensus around them, they had at least a choice between per- 
forming the experiment as carefully as possible or not, as well as a choice 
between describing the results in the most neutral way or in a way which is 
favourable to their preferred theories. As long as they suspect that their 
declarations will virtually close the debate, they will be strongly tempted to 
choose the second option in both cases, especially when there is only one 
leader (according to Collins, in the dispute about gravity waves this role was 
played by Richard Garwin). A full analysis of the episode in game-theoretical 
terms should indicate, hence, what the reasons of the leaders were for
behaving ‘honestly’ (if they did so), and also what reasons the remaining sci-
entists had for accepting the leaders’ assertions, especially if it was not clear
which strategy the leaders were to use. The mathematical model I present in
section 5 makes it possible to explain why scientists may choose to behave
‘honestly’ very frequently (though not always).


3 See also Zamora Bonilla (2002) for an analysis of a possible agreement among
researchers about how well ‘corroborated’ a hypothesis must be for its
acceptance to become compulsory.
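
The conditional strategies mentioned above (‘accept if at least so many colleagues accept; reject otherwise’) can be illustrated with a minimal threshold sketch in Python. It is a deliberate simplification and not the model of section 5: each researcher is reduced to a single hypothetical number, the count of accepting community members she requires, and degrees of belief, leaders, and payoffs are all left out. The function name `stable_acceptance` and the example thresholds are assumptions.

```python
def stable_acceptance(thresholds, initially_accepting=0):
    """Iterate threshold strategies until the number of accepters is stable.

    `thresholds[i]` is the number of accepting community members that
    researcher i requires before accepting the disputed result herself.
    The count of satisfied researchers never moves back and forth, so the
    loop always ends at a fixed point.
    """
    accepting = initially_accepting
    while True:
        satisfied = sum(1 for t in thresholds if t <= accepting)
        if satisfied == accepting:
            return accepting
        accepting = satisfied


# Ten hypothetical researchers; one of them accepts unconditionally.
with_pioneer = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# The same community, except that nobody accepts unconditionally.
without_pioneer = [1, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# A community in which most researchers wait for half of their colleagues.
cautious = [0, 1, 5, 5, 5, 5, 5, 5, 5, 5]

print(stable_acceptance(with_pioneer))
print(stable_acceptance(without_pioneer))
print(stable_acceptance(without_pioneer, initially_accepting=1))
print(stable_acceptance(cautious))
```

The second and third calls use the same thresholds but end in unanimous rejection and unanimous acceptance respectively, so both outcomes are stable for that community; the last call stops with a small accepting minority next to a rejecting majority.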

The next question is whether other possible equilibria could have existed. 
Collins himself strongly sympathises with this possibility, for he repeatedly 
asserts that every scientific controversy might have been ‘closed’ in a different 
way. For example, in the case of Weber, the community might have accepted 
the existence of gravity waves, since (according to Collins) the experiments 
did not point too clearly in the opposite direction. In that case, Weber
would have been recognised as an important contributor to the advancement 
of knowledge. But what about the other members of the scientific commun- 
ity? Would all of them necessarily have found it profitable to accept gravity 
waves given that the others had accepted them? Probably not, because, 
lacking a powerful theory to explain why these waves are as they were 
accepted to be (under these counterfactual circumstances), some researchers 
could still have opted for defending the old theory and rejecting Weber’s 
results. This simply means that, if other equilibria exist, some of them can in 
principle correspond not to a full consensus around the new result, but to a 
division of the community into two or more rival ‘schools’. Nevertheless, even 
if a unanimous acceptance of Weber’s results were a Nash equilibrium, it is 
very likely that it would be judged by most scientists as worse than an almost 
unanimous rejection. This is again because of the absence of a theory which 
can accommodate those results: with the help of the old theory they expect to
be able still to solve many problems, whereas the prospects for scientific
merit under the other scenario are much more uncertain. In a nutshell, had all 
of Weber’s colleagues accepted his results, they would surely have got, on 
average, a payoff below the payoff from almost unanimous rejection. 


4 Scientific norms as the constitution of science 


Although nearly all the choices individual scientists have to make refer to 
decisions whose outcome will depend on their coherence with the norms 
prevailing within their scientific community, the norms themselves must also 
be selected, for they are, after all, social conventions. As I said in the first 
section, the perspective advocated in this paper is that of constitutional 
political economy; so I assume that the norms governing scientists’ inter- 
actions are to be chosen by those scientists themselves, and I will ask what 
properties the norms can be expected to have according to that assumption. 
After all, though it is true that a single scientist can do little or nothing to 
substantially change the norms of her community, these can be easily changed 
by means of a collective agreement. A unanimous or almost unanimous
agreement about a norm can sometimes derive from its adoption by only a 
part of the community, for, if enough colleagues accept it (both in their 
practice and in their public assessment of the others’ internal scores), prob- 
ably many others will find it profitable to do the same. On the other hand, 
since most norms are better understood as regular practices than as explicit, 
well-defined precepts, these practices can also change smoothly by small 
individual adaptations to changing circumstances. In any case, if the norms 
have to be collectively accepted, it is absurd to assume, as constructivists 
sometimes claim (e.g., Latour and Woolgar 1979, Latour 1987), that they can 
be ‘imposed’ on the whole community by a small group of researchers, save, 
perhaps, when these have a monopoly over the material resources which are 
necessary for the rest. For example, a norm designed just to favour a particular 
theory or model would be rejected by those scientists who are proposing 
different ideas. 

Another relevant aspect of norms is that they tend to be in force for a long 
time (usually, more than the mean life of most theoretical models, for 
example). This has two important consequences regarding the epistemic 
properties of the norms. First, it is hard for researchers to guess what models 
they will be proposing in the future, when the norms which they are choosing 
today will still be in force; so, under a more or less thick ‘veil of ignorance’, 
and assuming that no monopoly over material resources exists, impartial (and, 
in particular, epistemic) criteria will preferably be employed in order to dis- 
cuss the acceptability of a norm. Second, at any moment in the evolution of a 
scientific discipline, prevailing norms will probably have evolved in order to 
help the community members in their striving to find acceptable results. This 
entails that methods, models, laws, and even styles of research, which had been 
accepted after being evaluated with impartial norms, may become a norm 
themselves (for example, your paper can be rejected in a physics journal if it 
contains a model which contradicts Maxwell’s equations, even if it is meth- 
odologically sound in any other respect). As a result, the methodological 
norms a community has at a given moment can be an obstacle for the 
adoption of new ideas; scientists will only take seriously the possibility of 
abandoning the old norms when the prospects of finding out new results which 
are acceptable according to those norms begin to decrease, as compared to the 
new norms. Although the last two points seem mutually contradictory, it is 
possible to accommodate them in the following way: after all, the arguments 
employed to defend the new norms must be based on some methodological 
criteria of a higher level; this means that these criteria are thought to be in 
force during a longer period, and in a wider field; so, these ‘metanorms’ will 
very probably be impartial and epistemically sound, or at least, more so than 
the ‘lower level’ norms which are assessed with their help. 

With respect to the norms which serve to determine the value of internal
scores, we can distinguish three different kinds. In the first place, there must 
be some norms about the disclosure of information, i.e., rules indicating which 
of your commitments have to be inscribed in your book; they can also 
determine who is entitled to access each part of another’s book, for not all its 
parts need be equally public. Obviously, the interests of the scientists in 
establishing some rules of disclosure of information instead of others can 
depend on the existing technical and institutional possibilities for getting resources
or other benefits by using that information in a non-public way.⁴
Second, some norms (which I shall call inferential norms) must establish what 
kinds of inferences from a set of actual or hypothetical commitments to 
another are mandatory, discretionary, or forbidden; these are the norms whose
fulfilment within a book is easier to check, for usually checking them only requires
analysing the ‘inscriptions’ contained in the book (Zamora Bonilla 2002). Third,
further norms must refer to the coherence of a book’s ‘inscriptions’ with 
something external; these norms serve to introduce in the books ‘inputs’ which 
are not just inferred from other commitments already contained in them. 
Usually, norms of this type establish the conditions under which a researcher 
or group of researchers are entitled to introduce a new inscription in such a 
way that their colleagues are committed to accept it by default, i.e., unless they 
manage to present a justifiable chain of commitments which lead to a different 
conclusion. Norms governing laboratory protocols and demanding repli- 
cability are of this kind. The most important point about these norms of 
observation is that they do not need to refer to an ‘indubitable empirical basis’, 
or something like that, for it is enough that scientists find it advantageous to 
play the game defined by these norms (amongst others). However, as long as 
the results of a discipline have some practical consequences, on which scien- 
tists’ payoffs may depend, it is sensible to assume that a discipline whose rules 
of inference and of observation lead systematically to mistaken practical 
conclusions will cease to get the resources it needs. So, the members of a 
scientific discipline will have an interest, if only for this reason, in collectively 
adopting a system of rules which is efficient in the production of (approx- 
imately) true statements. 


4 Dasgupta and David (1994) is the main reference on this topic.


5 The effectiveness of the scientific constitution


The status of norms is one of the most fiercely debated points in the sociology
and the philosophy of science. Without assuming that the game theoretic
approach can offer a definitive solution to all the problems related to scientific
norms, it can be useful, at least, to illuminate some deficiencies of other
approaches. For example, functionalists, such as Robert Merton, tend to argue
as if indicating the virtues of a norm from a ‘collective’ point of view were 
enough for explaining why this norm is accepted and obeyed by the individuals 
forming that collective. Obviously, those cases where the interests of the 
individual and those of the ‘group’, whatever this means, are in conflict pose a 
problem for this approach, for it leaves unexplained just why an individual 
decides, in the first place, to approve the rule, and, in the second place, to act 
according to it. Constructivists, in their turn, tend to talk about norms as if 
they were either mere rhetorical devices, or mechanisms for benefiting some 
privileged group. In this case, the problem is that, although this approach can 
explain why some people may have an interest in proposing or using some 
norms, it does not explain why others (knowing that the norms are just rhet- 
orical strategies for defending the interests of some) actually behave as if they 
also accepted these norms. In contrast, from a game theoretic point of view, 
individuals ‘obey’ the norms just because it is in their own interest to do it 
(though social influences on individual preferences are not discarded a priori). 
This means that a system of norms will be stable if and only if it constitutes a 
Nash equilibrium, i.e., if, under the assumption that the others obey the norms, 
anyone’s best option is also to obey. For example, given that most people 
speak a certain language in a country, it will be in my interest to do the same; 
given that firms and public administrations hire people according to their 
academic certificates, it will be in my interest to try to get some; given that 
judges and policemen do their work efficiently according to the prevailing
civil and criminal laws, it will be in my interest to obey these. As these examples
clearly show, when ‘obeying certain norms’ includes ‘punishing
those who do not obey’, general compliance with the rules is to be expected 
(Axelrod 1984, Elster 1989). In the case of science, this is reflected in the fact 
that a researcher’s book is permanently evaluated by other colleagues, 
whose evaluations are contained in their respective books, which are eval- 
uated by other scientists, and so on. For example, I will be punished if my 
model violates the law of energy conservation, but also if I fail to criticise a 
colleague whose model makes this mistake. So, the fact that a certain norm is 
followed by a high proportion of my colleagues makes not obeying very costly 
for me. 

Nevertheless, it is clear that disobedience may sometimes provide great 
advantages, particularly if the chances of not being discovered are high. I can 
manipulate experimental results, or fail to put enough effort in my work, or 
fail to disclose some information that the norms command to publish, and so 
on. The sociological literature is full of case studies showing how scientists
‘misbehave’, at least according to the rules they (scientists) preach, not to 
speak of the rules preached by the philosophers. Even some institutional 
mechanisms (which are norms themselves) may have the perverse effect of 
rewarding this type of misbehaviour (for example, the ‘publish-or-perish’
practice).⁵ The persistence and the spread of an institution like science, where
most fundamental things depend on the trust people put in other people’s
assertions, demands, however, that misconduct is severely limited, particularly 
in those cases where the fate of a discipline is at stake. Actually, science seems 
to attain this goal rather well even in the absence of something like a ‘police’ 
or ‘courts of justice’. The question is, hence, whether the mechanism of mutual 
checks described in the last sections is strong enough for deterring researchers 
from systematically disobeying the prevailing rules. If the answer were ‘no’, 
then either the public trust in scientific results should be much more fragile 
than what is usually presumed by the scientistic rhetoric, or the apparent 
stability of so many portions of scientific knowledge would just be based on 
scientists’ exceptional honesty. I hope, however, that the following toy model
may allow us to avoid this dilemma.

Let f be the frequency with which a researcher disobeys the norms, and 
suppose, for simplicity, that all infringements are equally important (if this is 
not the case, then f can be alternatively interpreted as a normalised average of
an individual’s infringements). Let u(f) (> 0) be the utility received by a
scientist if she is not discovered and disobeys the norms with frequency f, and
let −v(f) (< 0) be her disutility if she is discovered and hence punished. In this
model, punishment basically consists in reducing a researcher’s internal score,
e.g., by not accepting her papers for prestigious journals or congresses. The
probability of being discovered, p(f), is an increasing function of f. I will
assume that the functions u, v and p are equal for all the community members.
Given these assumptions, an individual’s expected utility from disobeying the
norms with frequency f is EU(f) = (1 − p(f))u(f) − p(f)v(f) = u(f) −
p(f)(u(f) + v(f)), and the optimum infringement frequency for her corresponds
to that value of f which maximises EU(f). On the other hand, it is
reasonable to assume that an individual’s utility depends on the frequency 
with which the norms are disobeyed by other researchers: the more frequently 
norms are infringed by your colleagues, the less utility will you get from the 
same actions (for example, because by producing outputs of a lower quality, 
the scientific community obtains less resources from society). Hence, a sit- 
uation where f were low for all, would also be better for everyone than a 
situation where f were high for all. The essential question is, of course, 
whether in a situation of equilibrium the f’s will be ‘high’ or ‘low’. In order to 
answer this question, I will add some more simplifications: first, suppose that
p(f) is just equal to f (i.e., the probability of being discovered is the same as
the frequency of infringement); second, assume that u(f) and v(f) are linear
functions of f, in particular, u(f) = a + bf, and v(f) = cf (with a, b, c > 0); this
entails that u(0) is positive (you get a positive payoff by not disobeying the
norms) and v(0) = 0 (you are not punished if you obey the norms); lastly, your
utility will also depend on the average frequency of infringement within the
rest of your community, f̄, so that u(f, f̄) = (1 − f̄)(a + bf) (i.e., even if your
infringements are not discovered, you get a null utility if norms are always
disobeyed), whereas v(f) does not depend on f̄ (i.e., you will be punished for
your infringements independently of how frequently your colleagues disobey
the norms).


5 See Wible (1998) for a good rational choice approach to the study of scientific fraud.

Under these assumptions, a researcher’s expected utility is given by 


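$$
\begin{aligned}
EU &= (1 - f)(1 - \bar{f})(a + bf) - f\,(cf) \\
   &= -f^{2}\bigl(b(1 - \bar{f}) + c\bigr) + f(b - a)(1 - \bar{f}) + a(1 - \bar{f}) \qquad (1)
\end{aligned}
$$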


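Individual maximisation of (1) is reached when ∂EU/∂f = 0, which yields the
optimal frequency

$$f^{*}(\bar{f}) = \frac{(b - a)(1 - \bar{f})}{2\bigl(b(1 - \bar{f}) + c\bigr)}. \qquad (2)$$

Some useful consequences are the following:

$$
\begin{aligned}
&\text{a)}\quad f^{*}(\bar{f}) < 1/2 \quad \text{if } a < b \\
&\text{b)}\quad f^{*}(\bar{f}) = 0 \quad \text{if } a \ge b \\
&\text{c)}\quad \frac{df^{*}}{d\bar{f}} = -\frac{2(b - a)c}{\bigl[2\bigl(b(1 - \bar{f}) + c\bigr)\bigr]^{2}} < 0 \quad \text{if } a < b \\
&\text{d)}\quad \left|\frac{df^{*}}{d\bar{f}}\right| < 1 \quad \text{if } a < b
\end{aligned}
\qquad (3)
$$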
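
The step from (1) to (2) and (3c) can be made explicit. What follows is only a worked check of the algebra under the assumptions already stated (p(f) = f, linear u and v, and a, b, c > 0), writing \bar{f} for the average infringement frequency of the rest of the community; it adds nothing to the model itself.

```latex
\begin{align*}
\frac{\partial EU}{\partial f}
  &= -2f\bigl(b(1-\bar{f})+c\bigr) + (b-a)(1-\bar{f}) = 0
  \;\Longrightarrow\;
  f^{*}(\bar{f}) = \frac{(b-a)(1-\bar{f})}{2\bigl(b(1-\bar{f})+c\bigr)}, \\[4pt]
\frac{\partial^{2} EU}{\partial f^{2}}
  &= -2\bigl(b(1-\bar{f})+c\bigr) < 0, \\[4pt]
\frac{df^{*}}{d\bar{f}}
  &= \frac{-(b-a)\cdot 2\bigl(b(1-\bar{f})+c\bigr) + (b-a)(1-\bar{f})\cdot 2b}
          {\bigl[2\bigl(b(1-\bar{f})+c\bigr)\bigr]^{2}}
   = -\frac{2(b-a)c}{\bigl[2\bigl(b(1-\bar{f})+c\bigr)\bigr]^{2}}.
\end{align*}
```

Since c > 0, the second derivative is negative, so the stationary point is a maximum. When a is at least as large as b, the stationary point is not positive and the constraint that f cannot be negative binds, which is the case covered by (3b).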


Thus, (3a) says that your optimal frequency of infringement is less than fifty 
percent. Furthermore, the bigger the reward a from always obeying the norms, 
and the stronger the punishing reaction c to your infringements, the smaller 
will this optimum frequency be. For example, if punishment is at least as
strong as the benefits you get from disobeying (i.e., if c ≥ b), then f* will be
smaller than 1/4. Note also that, if a ≥ b, then f* will be 0 according to (3b), for
in that case EU is decreasing within the interval [0,1]. On the other hand, (3c)
says that your optimum frequency of infringement decreases as the average 
frequency of your colleagues rises. This result essentially derives from the 
assumption that you are not less punished for your transgressions when your 
colleagues commit more infringements in the aggregate. Regarding other 
types of social norms, this need not be true; for example, when the police
work less efficiently, it is more probable that you will not be punished
because of your crimes (although you can be ‘punished’ even if you do not 
commit any crime), and this provides a reason to commit more crimes. 
However, in the case of science, researchers want essentially to have a global 
score higher than their colleagues’, and this entails that they will hardly miss 
the opportunity of denouncing your infringements, even when they commit 
many. So, our assumption that f̄ affects u but not v simply means that, the
higher f̄ is, the less willing will your colleagues be to recognise your merits,
though they will always be prone to punish you. Lastly, (3d) will be useful in
proving the next theorem; it can be derived from (3c) by taking into account
that f̄ is according to (3a) necessarily less than 1/2 if each researcher takes a
rational decision.
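
The properties listed in (3) can be checked numerically. The following is a minimal sketch, not part of the original model: the parameter values a = 1, b = 3, c = 2 are hypothetical, and the function names (`expected_utility`, `best_response`, `grid_argmax`) are introduced here only for illustration. It compares the closed-form optimum (2) with a brute-force maximisation of (1) over a grid.

```python
def expected_utility(f, f_bar, a, b, c):
    """Expected utility (1): undetected payoff minus expected punishment."""
    return (1 - f) * (1 - f_bar) * (a + b * f) - f * (c * f)


def best_response(f_bar, a, b, c):
    """Optimal frequency (2), truncated at zero as in (3b)."""
    interior = (b - a) * (1 - f_bar) / (2 * (b * (1 - f_bar) + c))
    return max(0.0, interior)


def grid_argmax(f_bar, a, b, c, steps=10_000):
    """Brute-force maximiser of (1) on a grid over [0, 1]; used only as a check."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda f: expected_utility(f, f_bar, a, b, c))


# Hypothetical parameter values (not from the text), chosen so that a < b.
A, B, C = 1.0, 3.0, 2.0

for f_bar in (0.0, 0.2, 0.4):
    closed_form = best_response(f_bar, A, B, C)
    numeric = grid_argmax(f_bar, A, B, C)
    print(f"f_bar={f_bar:.1f}  f*={closed_form:.4f}  grid check={numeric:.4f}")
```

With these illustrative values the optimum stays below 1/2 and falls as the colleagues' average rises, in line with (3a) and (3c); with a ≥ b the function returns 0, as in (3b).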

The main result of this simple model is the following: 
There is only one Nash equilibrium, which corresponds to the case where all 
researchers disobey the norms with a frequency f such that f*(f) =f. 


Proof: In figure 1a (see p. 276), this equilibrium corresponds to the point
where the function f*(f̄) crosses the line of 45 degrees (the identity line); let d
be the frequency associated to that point. In the first place, it is easy to see
that all scientists disobeying the norms just with frequency d is a sufficient
condition for a Nash equilibrium, because in that case the best option for
every researcher is choosing exactly f = d. To see that it is also a necessary
condition, suppose first that all researchers chose another frequency, such as d₁ in
figure 1a; this cannot be an equilibrium, because the optimum response
would not be d₁, but e, and hence researchers would not be acting rationally.
Suppose next that there were an equilibrium in which not all researchers
chose the same frequency. In this case, the average frequency could be equal
to d, higher (e.g., d₁), or lower (e.g., d₂). Suppose first that it were d (figure 1b,
see p. 276), and take one of the scientists who disobey the norms with the
highest frequency, h; if h is the frequency chosen by her, this means that the
average of the rest of the community (i.e., of every member save her) must be g,
if her decision is rational; hence, the average of the full community is at most
(h + g)/2, which is necessarily less than d, because |df*/df̄| < 1, and so the
community average cannot be d. In the second place, suppose that the
community average were d₁ > d (figure 1c, see p. 276); in this case, again, a
scientist selecting the highest of the chosen frequencies (h) will be responding,
if rational, to an average of the rest of the community equal to g, but then the
community average is at most (h + g)/2, which is again less than d₁, for
the same reason. In the third place, suppose that the average is d₂ < d
(figure 1d, see p. 276); in this case, a rational scientist who chooses the lowest
of the selected frequencies (k) will be responding to an average of the rest of
her community equal to g’, and the average of the full community will be
higher than (k + g’)/2, which is higher than d₂. Hence, the assumption that in
equilibrium not all researchers choose frequency d leads necessarily to
contradictions.
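
The crossing point d of the best-response curve and the 45-degree line can also be located numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: it uses bisection on f*(f) − f over [0, 1/2], which is justified because f* is decreasing by (3c) and below 1/2 by (3a), and the parameter values a = 1, b = 3, c = 2 are hypothetical.

```python
def best_response(f_bar, a, b, c):
    """Optimal frequency (2), truncated at zero as in (3b)."""
    interior = (b - a) * (1 - f_bar) / (2 * (b * (1 - f_bar) + c))
    return max(0.0, interior)


def equilibrium(a, b, c, tol=1e-12):
    """Symmetric equilibrium d with f*(d) = d, found by bisection.

    f*(f) - f is strictly decreasing, non-negative at 0 and negative at 1/2
    when a < b, so it has exactly one root in [0, 1/2].
    """
    if a >= b:
        return 0.0  # case (3b): nobody infringes
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if best_response(mid, a, b, c) > mid:
            lo = mid  # best response lies above the 45-degree line: move right
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical parameter values (not from the text).
d = equilibrium(1.0, 3.0, 2.0)
print(f"equilibrium frequency d = {d:.6f}")
print(f"best response to d      = {best_response(d, 1.0, 3.0, 2.0):.6f}")
```

For these illustrative values the condition f*(d) = d reduces to 3d² − 6d + 1 = 0, whose root in [0, 1/2] is 1 − √6/3, about 0.1835, so norms are infringed roughly 18 percent of the time in equilibrium.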

In conclusion, given the type of mutual control the members of a scientific
community exert over themselves, we can expect that an equilibrium arises in
which the norms are followed with a ‘high’ frequency. Perhaps this situation is
not ideal for scientists, nor for citizens, all of whom could get a higher utility if
the norms were always obeyed; but surely other situations are possible that
would be much worse. In fact, the picture which derives from our model seems
to be more realistic than that offered by the ‘deconstructionists’ referred to at
the beginning of this section, since according to them methodological norms
are not designed to be ‘obeyed’ at all, but just to be used as weapons in scientists’
rhetorical tournaments. In contrast to this, researchers seem to follow
methodological norms in a very systematic way, though not perfectly, and this
is what makes their infringements so salient when they are discovered (either
by their colleagues, or by the social students of science). The model presented
in this section makes this type of behaviour understandable in the case of
people who, as most social studies show, are not exclusively motivated by an
altruistic desire for truth.

Figure 1 Determination of the equilibrium rate of compliance with norms


Is It a Gang or the Scientific Community?


Comment by 


GEBHARD KIRCHGÄSSNER 


The paper by Zamora Bonilla intends to show that a game theoretic approach 
is helpful and perhaps even “unavoidable for understanding many features of 
science” (p. 263). He interprets science as a game and the outcome of the 
scientific process as a Nash-equilibrium. To underline his point, he constructs 
a model of this process and he shows that the equilibrium outcome of this 
process is unique. Thus, his approach might be seen as being an additional 
version of economic imperialism. While this was in the past mainly directed 
towards other social sciences, it is now also directed towards philosophy 
(sociology) of science. 

Despite the fact that this paper is very interesting to read as it provides 
many interesting historical details which might be interpreted in a game 
theoretic framework, it is not clear whether this paper reaches its objective (or 
is at least an important step in this direction). In particular, there are three
questions I want to raise:

(i) Does the model really represent the characteristics of the scientific
process, or is it just the behaviour of some (special) groups, like criminal
gangs? Should it be modified to account more for specific aspects of this
process?

(ii) Does the game theoretic approach (the model) add anything to our 
understanding of the scientific process? Is this approach really ‘unavoidable’? 
Or is it just a new language game? 

(iii) Can this paper convince somebody who is not familiar with the game 
theoretic approach that this approach can add something to our understanding 
of the scientific process? 


In the following, I will mainly discuss the model, its limitations and some 
extensions. In doing so, I take on the traditional economic perspective and try 
to give an answer to the first question. Finally, however, I will take on the 
perspective of a social scientist who is not an economist and try to give 
answers to the other two questions. As far as I can see, despite some interesting
insights in this paper, the model is neither very specific for the scientific
process (community) nor will this paper convince other social scientists who 
do not already follow the rational choice approach that the game theoretic 
perspective leads to new insights. To reach this, we would need ‘new’ results, 
i.e. insights about the scientific process derived by applying the game theoretic 
approach which at least partially contradict traditional beliefs but are sup- 
ported by empirical observations. 


1 Does the model describe the scientific process?


The model describes a group of identical individuals with its own internal
norm system which is different from (or in addition to) the norm system
outside this group. Violation of these norms occurs with frequency f, 0 ≤ f ≤ 1,
and, if it is not detected, provides a benefit u, which is (for simplicity) assumed
to be linear in the violation of the law,

$$u(f, F) = (a + bf)(1 - F), \qquad (1)$$

with parameters a, b > 0, where F is the average violation of the norm in the
rest of the group. The probability that a violation will be detected is also f, and
the punishment v is

$$v(f) = cf, \qquad (2)$$

with c > 0. Thus, the objective function

$$E[u(f, F)] = (1 - f)(a + bf)(1 - F) - f\,(cf) \qquad (3)$$

is to be maximised, which leads to the first order condition

$$-2f\bigl(b(1 - F) + c\bigr) + (b - a)(1 - F) = 0, \qquad (4)$$

and to the solution

$$f^{*} = \frac{(b - a)(1 - F)}{2\bigl(b(1 - F) + c\bigr)}. \qquad (5)$$

The second order condition is

$$-2\bigl(b(1 - F) + c\bigr) < 0 \quad \text{for } F \le 1. \qquad (6)$$

If a ≥ b, the right hand side of (5) is not positive and, therefore, f* = 0. If
0 < a < b, then

$$\left| df^{*}/dF \right| < 1. \qquad (7)$$

Thus, f*(·) is contractive, which, according to the Banach Contraction Principle,
ensures that the fixed point f = F is unique and stable.¹
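
The contraction argument can be illustrated by iterating the best-response map (5) from different starting values of F. This is a sketch under stated assumptions only: the parameter values a = 1, b = 3, c = 2 are hypothetical, the helper names are introduced here for illustration, and for these particular values the slope of f* stays below 1 in absolute value on the whole interval [0, 1].

```python
def best_response(F, a, b, c):
    """Solution (5), truncated at zero for the case a >= b."""
    return max(0.0, (b - a) * (1 - F) / (2 * (b * (1 - F) + c)))


def iterate(F0, a, b, c, steps=50):
    """Repeatedly apply the best-response map, starting from average F0."""
    path = [F0]
    for _ in range(steps):
        path.append(best_response(path[-1], a, b, c))
    return path


# Hypothetical parameter values (not from the text).
a, b, c = 1.0, 3.0, 2.0

for start in (0.0, 0.5, 1.0):
    path = iterate(start, a, b, c)
    # Distances between successive iterates shrink, as a contraction requires.
    gaps = [abs(x - y) for x, y in zip(path, path[1:])]
    print(f"start={start:.1f}  limit={path[-1]:.6f}  first gaps={gaps[:3]}")
```

All three starting points settle on the same value, about 0.1835, which is the sense in which the fixed point is both unique and stable for these illustrative parameters.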

There are several problems connected with this model: 

(i) Punishment is assumed to be exogenous; it is without costs and depends on f,
but not on F. This does not seem to make much sense, as it implies that
punishment takes its maximum when all individuals violate the norm (f = F =
1). This is hardly plausible, as in this case one would rather expect no punishment
at all. Thus, let us assume that punishment depends on F, v = v(f, F).


1 See, e.g., Dugundji and Granas (2003, 9f.). This replaces the rather cumbersome
proof in the paper.

The more a law is violated, the smaller the punishment usually is, ∂v(f, F)/∂F < 0.
There are many cases where behaviour, which originally was strictly forbidden 
and, if detected, severely punished, was legalised when it became common. 
This holds, e.g., for abortion, but in some cases also for consumption of illegal 
drugs. Thus, a reasonable assumption is v(f, 1) = 0, i.e. if all members of a
group violate the norm, there is no punishment at all. Assuming a similar 
relation as in the utility function, the punishment function might be written as 


$$v(f, F) = cf \cdot (1 - F), \qquad (8)$$

with c > 0. The solution of this model is

$$f^{*} = \frac{b - a}{2(b + c)}, \qquad (9)$$


which makes the optimal norm-violating behaviour independent of the 
behaviour of the rest of the group. The reason for this is that the utility as well 
as the punishment depend in the same way on the frequency with which the 
others violate the norm. To come to a solution where the norm violating 
behaviour depends on the behaviour of others, we would need two different 
(plausible) functional forms of how the utility from violating the norm as well 
as the punishment if such a behaviour is detected depend on the average 
frequency of violation.²
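
The independence result can be checked directly. As a sketch, substituting the punishment function (8) into the objective function (3) lets the common factor (1 − F) be pulled out, so that it drops out of the first order condition whenever F < 1:

```latex
\begin{align*}
E[u(f,F)] &= (1-f)(a+bf)(1-F) - f\,cf\,(1-F) \\
          &= (1-F)\bigl[a + (b-a)f - (b+c)f^{2}\bigr], \\[4pt]
\frac{\partial E[u]}{\partial f} &= (1-F)\bigl[(b-a) - 2(b+c)f\bigr] = 0
  \;\Longrightarrow\; f^{*} = \frac{b-a}{2(b+c)} \qquad (F<1).
\end{align*}
```

Because F enters only through the multiplicative factor (1 − F), it scales the level of expected utility but not the location of its maximum.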

(ii) Punishment has costs and benefits. Pointing to the norm violations of
others might increase one’s own reputation, but also demands resources and,
what may be more important, can produce enemies. But if it reduces their 
utility, why should people punish others? This, at least, is not consistent with 
the primitives of neo-classical theory which are the basis of this model. 


On the other hand, modern behavioural economics has shown that there are
people who bear the costs and punish others, even if it is costly.³ Their
behaviour is essential to ensure that norms are observed within a society 
which are not sanctioned in a formal way (e.g., by penalties through the 
judicial system). However, not all individuals behave in this way. There are 
‘altruists’, who behave in this way, but also ‘egoists’ who follow the classical 
economic assumptions. Thus, to describe the equilibrium outcome of this 
game we have to distinguish (at least) two kinds of individuals; a model 
assuming identical individuals is hardly able to correctly describe the scientific 
game. 


2 One might also question why the utility from strictly following the norm, a, depends
on the behaviour of others and is zero whenever all violate the law. However, making 
this utility independent which implies setting a=0 and including a constant into the 
utility function (3) would not change the character of the solution. 

3 See, e.g., Fehr and Gächter (2002). 

(iii) Finally, which are the norms or rules to be observed? And how
autonomous are scientists in setting these rules? We can distinguish three 
kinds of rules: 

a) Basic rules, like the rules of logic, or the norm that experimental results 
should be reported correctly. 

b) Rules which constitute the hard core of the paradigm (or of a scientific 
research programme). 

c) Rules which belong to the security belt of the research programme. 


The model applies only to the basic rules (a). These are the ones where 
individuals try to hide their violations and where punishment is to be expected 
if these violations become public. Rules which belong to the security belt (c)
may, on the other hand, be suspended at any time without major consequences.
Perhaps the most interesting rules are those which constitute the
hard core (b). In neoclassical economics, these rules, e.g., demand basing 
theoretical models on the primitives of this theory. Violations of these rules 
are punished, but the violators do not try to hide them. Just the contrary is the 
case: to reach the benefit of the violation it is necessary that the violations 
become public. If the violators are successful, this might lead to a change of 
the paradigm in the long-run and to scientific reputation. These violations 
might be seen as risky investments in one’s own future scientific reputation.

Though the model in the paper does not apply to these rules, the paper also 
deals with them. And this is reasonable, as violations of these rules are often a 
precondition for scientific progress. But this aspect is not discussed in the 
paper. 

Thus, this model covers only part of the norms of the scientific process, 
perhaps not even the most relevant ones. It is just a model of norm-violating 
behaviour which could be applied in the same way (or perhaps even better) to 
any other group with an internal norm system and, therefore, also to a gang of 
criminals. To that extent, it is of very limited value for understanding the scientific
process, and it is at least debatable whether it really covers the essentials of 
this process. From a perhaps somewhat naive point of view, the objective of 
science is the generation of true statements about reality. This holds, despite 
the fact that we can never be sure that a scientific statement is true. A model 
which describes the scientific process should, at least in my opinion, make
reference to this. This would also distinguish it from a model of a criminal gang.


2 Is game theory unavoidable to understand the scientific process?


Even if this model does meet the essentials of the scientific process, it can be
asked whether the game theoretic approach is really “unavoidable for
understanding many features of science” (p. 265). Obviously, science is an
interactive game played in a community which has a set of rules which the
actors more or less obey. Thus, the language of game theory can be used for 
describing this process. It is true, that the system of rules of the scientific 
community “will be stable if and only if it constitutes a Nash equilibrium, i.e. if 
under the assumption that the others obey the norms, anyone’s best option is 
also to obey” (p. 272). But is this sufficient to prove that there is an added
value in applying game theory? Is then, therefore, each application of the 
economic model of behaviour to human interaction an application of game 
theory? 

In my opinion, to be justified, the application of game theoretic
concepts should give more and new insights in addition to what we already 
know. The idea which is behind the definition of an equilibrium given above is 
much older than the formal treatment by Nash (1950), and has been applied 
long before by economists.⁴ Moreover, one of the basic characteristics of the
scientific process is that it is never in equilibrium and that its rules are per- 
manently changed. This raises the question whether the concept of a Nash- 
equilibrium is well suited to describe the core characteristics of the scientific 
process. Would not (at least) a model of a dynamic game be necessary? 

As mentioned in the introduction, in the first parts the paper provides many 
interesting details of and insights into the scientific process. However, this is 
quite independent of the use of the language of game theory. It is to be hoped that
game theory, when used for analysing the scientific process, can provide new 
insights in the future, and taking this paper as a first attempt, we should 
perhaps not be too critical. Nevertheless, this paper does not provide such 
insights; it presents game theory as a new rhetoric or language game of con- 
cepts we have known before. Consequently, this paper will hardly convince 
any non-economist (or non-game-theorist) of the usefulness of the game 
theoretic language game. 


Distributed Cognition:
A Perspective from Social Choice Theory 


by 
CHRISTIAN LIST* 


1 Introduction 


‘Distributed cognition’ refers to processes with two properties. First, they are
cognitive, i.e. they involve forming certain representations of the world. Sec- 
ond, they are not performed by a single (human) agent, but are distributed 
across multiple (human) agents or (technical) devices. Distributed cognition 
has attracted interest in several fields, ranging from law (e.g., jury decision 
making) and sociology (e.g. information processing in organizations) to 
computer science (e.g., GRID computing) and the philosophy of science (e.g., 
expert panels). 

An influential account of distributed cognition is Hutchins’s (1995) study of 
navigation on a US Navy ship. Hutchins describes the ship’s navigation as a 
process of distributed cognition. It is a cognitive process in that it leads to 
representations of the ship’s position and movements in its environment. It is 
distributed in that there is no single individual on the ship who performs the 
complex navigational task alone, but the task is performed through the 
interaction of many individuals, together with technical instruments. At any 
given time, no single individual may be fully aware of the navigational process 
in its entirety. Thus, on Hutchins’s account, the ship’s navigation is performed 
not at the level of a single individual, a ‘chief navigator’, but at the level of a 
larger system. 

In the philosophy of science, Giere (2002) argues that many scientific 
practices, especially large-scale collaborative research practices, involve dis- 
tributed cognition, as these practices are “situation[s] in which one or more 
individuals reach a cognitive outcome either by combining individual 
knowledge not initially shared with the others or by interacting with artefacts 
organized in an appropriate way (or both)” (2002, 641). He distinguishes 
between ‘distributed’ and ‘collective’ cognition, where the first is more general 
than the second. Distributed cognition includes not only cases of collective 
cognition, where a cognitive task is distributed across multiple 
individuals, but also cases where such a task is distributed between a single 
individual and an artefact, such as a technical instrument.¹ While researchers 
often compete with one another, collectively distributed cognition is a phenomenon 
associated with more cooperative practices within research groups 
or communities. 


* Although earlier versions of this paper were presented at seminars at the Australian 
National University, the London School of Economics and the University of Liverpool in 
2003, this revised version draws substantially on related technical material in List 
(2005b), but offers a different interpretation of this material. I am grateful to the seminar 
participants on these occasions and particularly the organizers and participants of the 24th 
Conference on New Political Economy in Saarbrücken, October 2005, for comments and 
suggestions. I especially thank Max Albert and Siegfried Berninghaus for their helpful 
comments. 

Knorr Cetina (1999) provides a case study of distributed cognition in sci- 
ence. Studying high-energy physics research at the European Center for 
Nuclear Research (CERN), she observes that experiments, which lead to 
cognitive outcomes, involve many researchers and technicians, using complex 
technical devices, with a substantial division of labour, expertise, and 
authority. She describes this research practice as “something like distributed 
cognition” (25, cited in Giere 2002).² 

Other instances of distributed cognition in science can be found in multi- 
member expert committees. For example, in 2000, the National Assessment 
Synthesis Team, an expert committee commissioned by the US Global 
Change Research Program with members from governments, universities, 
industry and non-governmental organizations, presented a report on climate 
change.³ Such a committee’s work is cognitive in that it involves the representation 
of certain facts about the world; and it is distributed in that it 
involves a division of labour between multiple committee members and a 
pooling of different expertise and judgments. Here it may be more plausible to 
ascribe authorship of the report to the committee as a whole rather than any 
particular committee member. 

In this paper, I discuss collectively distributed cognition from the per- 
spective of social choice theory. Social choice theory can provide a general 
theory of the aggregation of multiple (individual) inputs into single (collec- 
tive) outputs, although it is usually applied to the aggregation of preferences. 
Drawing on social-choice-theoretic models from the emerging theory of 
judgment aggregation (e.g., List and Pettit 2002, 2004; Pauly and van Hees 
2005; Dietrich 2005; Bovens and Rabinowicz 2005; List 2005a, 2005b, 2006), I 
address two questions. First, how can we model a group of individuals as a 


1 Collective cognition is “[a] special case of distributed cognition, in which two or 
more individuals reach a cognitive outcome simply by combining individual knowledge 
not initially shared with others” (Giere 2002, 641). 

2 Knorr Cetina also studies research practices in molecular biology, but argues that 
here research is more individualized than in high energy physics and “the person remains 
the epistemic subject” (217, cited in Giere 2002). Giere (2002, especially 643) responds 
that, while there may be less collective cognition in molecular biology than in high energy 
physics, there may still be distributed cognition, “where the cognition is distributed 
between an individual person and an instrument”. 

3 The title of the report is “Climate Change Impacts on the United States: The 
Potential Consequences of Climate Variability and Change”. See http://www.usgcrp.gov/. 

distributed cognitive system? And, second, can a group acting as a distributed 
cognitive system be rational and track the truth in its cognitive outputs? 

I argue that a group’s performance as a distributed cognitive system 
depends crucially on its organizational structure, and a key part of that 
organizational structure is the group’s ‘aggregation procedure’, as defined in 
social choice theory. An ‘aggregation procedure’ is a mechanism a multi- 
member group can use to combine (‘aggregate’) the judgments or repre- 
sentations held by the individual group members into judgments or repre- 
sentations endorsed by the group as a whole. I investigate the ways in which a 
group’s aggregation procedure affects its capacity to be rational and to track 
the truth in the outputs it produces as a distributed cognitive system. 

My discussion is structured as follows. I begin with some introductory 
remarks about modelling a group as a distributed cognitive system in section 2 
and introduce the concept of an aggregation procedure in section 3. The core 
of my discussion consists of sections 4 and 5, in which I discuss a group’s 
capacity to be rational and to track the truth in its cognitive outputs, 
respectively. In section 6, I draw some conclusions. 


2 Modelling a group as a distributed cognitive system 


When does it make sense to consider a group of individuals as a distributed 
cognitive system rather than a mere collection of individuals? First, the group 
must count as a well-demarcated system, and, second, it must count as a 
system that produces cognitive outputs. 

The first condition is met if and only if the group’s collective behaviour is 
sufficiently integrated. A well organized expert panel, a group of scientific 
collaborators or the monetary policy committee of a central bank, for 
example, may have this property, whereas a random crowd of people at 
London’s Leicester Square lacks the required level of integration. And the 
second condition is met if and only if the group is capable of producing 
outputs that have representational content; let me call these outputs ‘collec- 
tive judgments’. If a group’s organizational structure — e.g. its procedures for 
generating a joint report — allows the group to make certain joint declarations 
that count as collective judgments, then the group has this property, whereas a 
group without any formal or informal organization, such as a random crowd at 
Leicester Square, lacks the required capacity. 

At first sight, we may be reluctant to attribute judgments to groups over and 
above their individual members. But, as Goldman (2004, 12) has noted, in 
ordinary language, groups or collective organizations are often treated as 
subjects for the attribution of judgments. Goldman’s example is the recent 
debate on what the FBI as a collective organization did or did not “know” 
prior to the terrorist attacks of 9/11. In addition to the literature on distributed 
cognition, there is now a growing literature in philosophy that considers 
conditions under which groups are sufficiently integrated to produce outputs 
that we normally associate with rational agency (e.g., Rovane 1998; Pettit 
2003; List and Pettit 2005a, 2005b). Roughly, a sufficient level of integration is 
given in those cases in which it is pragmatically and explanatorily useful to 
describe the group’s outputs in intentional terms (Dennett 1987), namely as 
the group’s ‘beliefs’, ‘judgments’, ‘commitments’ or ‘knowledge’. Arguably, 
this condition is satisfied by those groups that Hutchins, Giere, Knorr Cetina 
and others have described as distributed cognitive systems. 

In short, a necessary condition for distributed cognition in a group is the 
presence of an organizational structure that allows the group to produce 
collective judgments, i.e., collective outputs with representational content. 
Once this necessary condition is met, the group’s performance as a distributed 
cognitive system depends on the nature of that organizational structure. 

Consequently, to construct a model of a group as a distributed cognitive 
system, we need to represent not only the individual group members, but also 
the group’s organizational structure. In the next section, I illustrate how we 
can think about this organizational structure in terms of a simple social- 
choice-theoretic model. 


3 The concept of an aggregation procedure 


How can we think about a group’s organizational structure? Let me introduce 
the concept of an ‘aggregation procedure’ to represent (a key part of) a 
group’s organizational structure. As defined in the theory of judgment 
aggregation (List and Pettit 2002, 2004; List 2006), an aggregation procedure 
is a mechanism by which a group can generate collective judgments on the 
basis of the group members’ individual judgments (illustrated in table 1). 
Formally, an aggregation procedure is a function which assigns to each com- 
bination of individual judgments across the group members a corresponding 
set of collective judgments. A simple example is ‘majority voting’, whereby a 
group judges a given proposition to be true whenever a majority of group 
members judges it to be true. Below I discuss several other aggregation 
procedures. 

Of course, an aggregation procedure captures only part of a group’s 
organizational structure (which may be quite complex), and there are also 
multiple ways (both formal and informal ones) in which a group might 
implement such a procedure. Nonetheless, as argued below, aggregation 
procedures are key factors in determining a group’s performance as a dis- 
tributed cognitive system. 

Table 1 An aggregation procedure 


Input (individual judgments) 
↓ 
Aggregation procedure 
↓ 
Output (collective judgments) 


In the next section, I ask what properties a group’s aggregation procedure 
must have for the group to be rational as a distributed cognitive system 
(specifically, consistent, but also complete, in its collective judgments), and in 
the subsequent section, I ask what properties it must have for the group to 
track the truth in these judgments. Both discussions illustrate that a group’s 
performance as a distributed cognitive system depends on its aggregation 
procedure. 


4 Rationality in a distributed cognitive system 


Suppose a group is given a cognitive task involving the formation of collective 
judgments on some propositions. Can the group ensure the consistency of 
these judgments? 


4.1 A majoritarian inconsistency 


Consider an expert committee that has to prepare a report on the health 
consequences of air pollution in a big city, especially pollution by particles 
smaller than 10 microns in diameter. This is an issue on which there has 
recently been much debate in Europe. The experts have to make judgments 
on the following propositions: 

p: The average particle pollution level exceeds 50 µg m⁻³ (micrograms 
per cubic metre of air). 

p → q: If the average particle pollution level exceeds 50 µg m⁻³, then residents 
have a significantly increased risk of respiratory disease. 

q: Residents have a significantly increased risk of respiratory disease. 


All three propositions are complex factual propositions on which the experts 
may disagree.⁴ Suppose the group uses majority voting as its aggregation 
procedure, i.e. the collective judgment on each proposition is the majority 


4 Propositions p and p → q can be seen as ‘premises’ for the ‘conclusion’ q. Determining 
whether p is true requires an evaluation of air quality measurements; determining 
whether p → q is true requires an understanding of causal processes in human physiology; 
finally, determining whether q is true requires a combination of the judgments on p 
and p → q. 

judgment on that proposition, as defined above. Now suppose the experts’ 
individual judgments are as shown in table 2. 


Table 2 A majoritarian inconsistency 


               p       p → q   q 
Individual 1   True    True    True 
Individual 2   True    False   False 
Individual 3   False   True    False 
Majority       True    True    False 


Then a majority of experts judges p to be true, a majority judges p → q to be 
true, and yet a majority judges q to be false, an inconsistent collective set of 
judgments. The expert committee fails to be rational in the collective judgments 
it produces as a distributed cognitive system. 
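
The inconsistency in table 2 can be checked mechanically. The following is a minimal sketch, not part of the original analysis: the dictionary key "p->q" and the function names are illustrative choices, and consistency is tested only for the three propositions of this example.

```python
# Individual judgments from table 2, one dict per expert.
# The key "p->q" is an illustrative label for the conditional proposition.
PROFILE = [
    {"p": True,  "p->q": True,  "q": True},   # individual 1
    {"p": True,  "p->q": False, "q": False},  # individual 2
    {"p": False, "p->q": True,  "q": False},  # individual 3
]


def majority_vote(profile):
    """Proposition-wise majority voting: a proposition is collectively
    judged true iff more than half of the individuals judge it true."""
    n = len(profile)
    return {prop: sum(j[prop] for j in profile) > n / 2 for prop in profile[0]}


def is_consistent(judgment):
    """A complete judgment on p, p->q and q is consistent iff the conditional
    has the truth value that classical logic gives it for these p and q."""
    return judgment["p->q"] == ((not judgment["p"]) or judgment["q"])


collective = majority_vote(PROFILE)
print(collective)
print("individuals consistent:", all(is_consistent(j) for j in PROFILE))
print("majority consistent:", is_consistent(collective))
```

Each individual passes the consistency check, while the majority judgments (p true, p → q true, q false) do not.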

This problem, sometimes called a ‘discursive dilemma’, illustrates that, 
under the initially plausible aggregation procedure of majority voting, a group 
acting as a distributed cognitive system may not achieve consistent collective 
judgments even when all group members hold individually consistent judg- 
ments (Pettit 2001; List and Pettit 2002, 2004; List 2006; the problem originally 
goes back to the so-called ‘doctrinal paradox’ first identified by Kornhauser 
and Sager 1986). 

Is the present example just an isolated artefact, or can we learn something 
more general from it? 


4.2 An impossibility theorem 


Consider again any group of two or more individuals that is given the cognitive 
task to form collective judgments on a set of non-trivially interconnected 
propositions, as in the expert committee example.⁵ Call an agent’s 
judgments on these propositions ‘complete’ if, for each proposition-negation 
pair, the agent judges either the proposition or its negation to be true; and call 


5 Following List (2006), a set of propositions is ‘non-trivially interconnected’ if it is of 
one of the following forms (or a superset thereof): (i) it includes k > 1 propositions p₁, ..., 
pₖ and either their conjunction ‘p₁ and ... and pₖ’ or their disjunction ‘p₁ or p₂ or ... or pₖ’ 
or both (and the negations of all these propositions); (ii) it includes k > 1 propositions p₁, 
..., pₖ, another proposition q and either the proposition ‘q if and only if (p₁ and ... and pₖ)’ 
or the proposition ‘q if and only if (p₁ or p₂ or ... or pₖ)’ or both (and negations); (iii) it 
includes propositions p, q and p → q (and negations). 

these judgments ‘consistent’ if the set of propositions judged to be true by the 
agent is a consistent set in the standard sense of propositional logic.⁶ 

Suppose now that each individual holds complete and consistent judgments 
on these propositions, and that the collective judgments are also required to 
be complete and consistent. One can then prove the following impossibility 
result (for a discussion of parallels and disanalogies between this result and 
Arrow’s (1951) classical theorem, see List and Pettit 2004 and Dietrich and 
List 2005a). 

Theorem (List and Pettit 2002). There exists no aggregation procedure 
generating complete and consistent collective judgments that satisfies the 
following three conditions simultaneously: 

Universal domain. The procedure accepts as admissible input any logically 
possible combination of complete and consistent individual judgments on the 
propositions. 

Anonymity. The judgments of all individuals have equal weight in deter- 
mining the collective judgments. 

Systematicity. The collective judgment on each proposition depends only on 
the individual judgments on that proposition, and the same pattern of 
dependence holds for all propositions. 

In short, majority voting is not the only aggregation procedure that runs 
into problems like the one illustrated in table 2 above. Any procedure sat- 
isfying universal domain, anonymity and systematicity does so. If these con- 
ditions are regarded as indispensable requirements on an aggregation pro- 
cedure, then one has to conclude that a multi-member group acting as a 
distributed cognitive system cannot ensure the rationality of its collective 
judgments. But this conclusion would be too quick. The impossibility theorem 
should be seen as characterizing the logical space of aggregation procedures 
(List and Pettit 2002; List 2006). In particular, we can characterize different 
aggregation procedures in terms of which conditions they meet and which 
they violate. 

If a group acting as a distributed cognitive system seeks to ensure the 
rationality of its collective judgments, the group must use an aggregation 
procedure that violates at least one of the conditions of the theorem. 


4.3 First solution: relaxing universal domain 


If the amount of disagreement in a particular group is limited or if the group 
has mechanisms in place for reducing disagreement, such as mechanisms of 
group deliberation, the group might use an aggregation procedure that 
violates universal domain. For example, a deliberating group that successfully 
avoids combinations of individual judgments of the kind in table 2 might use 
majority voting as its aggregation procedure and yet generate rational collective 
judgments. 


6 This consistency notion is stronger than that in List and Pettit (2002). But when the 
present consistency notion is used, no deductive closure requirement needs to be added. 

But this solution does not work in general. Even in an expert committee 
whose task is to make judgments on factual matters without conflicts of 
interest, disagreement may still be significant and pervasive. Although one 
can study conditions that make the occurrence of judgment combinations of 
the kind in table 2 less likely (Dryzek and List 2003; List 2002), I set this issue 
aside here and assume that groups that are faced with primarily cognitive 
tasks (as opposed to primarily political ones, for example) should normally 
use aggregation procedures satisfying universal domain. 
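
To give a sense of how much of the universal domain is problematic in the three-proposition example, the following sketch enumerates every combination of complete and consistent individual judgments on p, p → q and q for a three-member group and counts those on which majority voting is inconsistent. The group size of three and all names are assumptions made for illustration only.

```python
from itertools import product


def consistent_judgment_sets():
    """All complete and consistent judgments on (p, p->q, q):
    one for each truth assignment to p and q."""
    return [(p, (not p) or q, q) for p, q in product([True, False], repeat=2)]


def majority(profile):
    """Proposition-wise majority over a tuple of (p, p->q, q) judgments."""
    n = len(profile)
    return tuple(sum(column) > n / 2 for column in zip(*profile))


def count_inconsistent_profiles(n_individuals=3):
    """Return (number of profiles with inconsistent majority, number of profiles)."""
    sets = consistent_judgment_sets()
    profiles = list(product(sets, repeat=n_individuals))
    bad = [pr for pr in profiles if majority(pr) not in sets]
    return len(bad), len(profiles)


bad, total = count_inconsistent_profiles()
print(bad, "of", total, "profiles yield inconsistent majority judgments")  # 6 of 64
```

In this small case only the profiles that permute the three judgment sets of table 2 cause trouble, so a group that reliably avoids them could use majority voting safely; nothing here shows how likely such profiles are in practice.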


4.4 Second solution: relaxing anonymity 


It can be shown that, if anonymity is relaxed but the other two conditions are 
retained, the only possible aggregation procedure is a ‘dictatorial procedure’, 
whereby the collective judgments are always those of some antecedently fixed 
group member (the ‘dictator’) (Pauly and van Hees 2005). Some groups might 
put one individual, say a committee chair, in charge of forming their collective 
judgments. But this solution clearly conflicts with the idea of collectively 
distributed cognition, and as discussed below, a group organized in this dic- 
tatorial way loses out on the epistemic advantages of distributed cognition. 
However, below I also suggest that a group acting as a distributed cognitive 
system may sometimes benefit from relaxing anonymity together with sys- 
tematicity and implementing a division of cognitive labour whereby different 
components of a complex cognitive task are allocated to different subgroups. 


4.5 Third solution: relaxing systematicity 


A potentially promising solution lies in relaxing systematicity, i.e., treating 
different propositions differently in the process of forming collective judg- 
ments. For the purposes of a given cognitive task, a group may designate some 
propositions as ‘premises’ and others as ‘conclusions’ and assign epistemic 
priority either to the premises or to the conclusions (for a more extensive 
discussion of this process, see List 2006). 

If the group assigns priority to the premises, it may use the so-called 
‘premise-based procedure’, whereby the group first makes a collective 
judgment on each premise by taking a majority vote on that premise and then 
derives its collective judgments on the conclusions from these collective 
judgments on the premises. In the expert committee example, propositions p 
and p → q might be designated as premises (perhaps on the grounds that p and 
p → q are more basic than q), and proposition q might be designated as a 
conclusion. The committee might then take majority votes on p and p → q and 
derive its judgment on q from its judgments on p and p → q.⁷ 

Alternatively, if the group assigns priority to the conclusions, it may use the 
so-called ‘conclusion-based procedure’, whereby the group takes a majority 
vote only on each conclusion and makes no collective judgments on the 
premises. In addition to violating systematicity, this aggregation procedure 
fails to produce complete collective judgments. But sometimes a group is 
required to make judgments only on conclusions, but not on premises, and in 
such cases incompleteness in the collective judgments on the premises may be 
defensible. 

The premise- and conclusion-based procedures are not the only aggregation 
procedures violating systematicity. Further important possibilities arise when 
both systematicity and anonymity are relaxed. The group can then use an 
aggregation procedure that not only assigns priority to the premises, but also 
assigns different such premises to different subgroups and thereby implements 
a particularly clear form of distributed cognition. Specifically, the group may 
use the so-called ‘distributed premise-based procedure’. Here different 
individuals specialize on different premises and give their individual judg- 
ments only on these premises. Now the group makes a collective judgment on 
each premise by taking a majority vote on that premise among the relevant 
‘specialists’, and then the group derives its collective judgments on the con- 
clusions from these collective judgments on the premises. This procedure is 
discussed in greater detail below. 

For many cognitive tasks performed by groups, giving up systematicity and 
using a (regular or distributed) premise-based or conclusion-based procedure 
may be an attractive way to avoid the impossibility result explained above. 
Each of these procedures allows a group to produce rational collective 
judgments. Arguably, a premise-based or distributed premise-based procedure 
makes the group’s performance as a unified cognitive system particularly 
visible. A group using such a procedure acts as a reason-driven system when it 
derives its collective judgments on conclusions from its collective judgments 
on relevant premises. 

However, giving up systematicity comes with a price. Aggregation proce- 
dures that violate systematicity may be vulnerable to manipulation by pri- 
oritizing propositions strategically, and strategic agents with agenda-setting 
influence over the group might exploit these strategic vulnerabilities. 


7 In the present example, the truth-value of q is not always settled by the truth-values 
of p and p → q; so the group may need to strengthen its premises in order to make them 
sufficient to determine its judgment on the conclusion. 

For example, in the case of a regular premise-based procedure, the collective 
judgments may be sensitive to the choice of premises. In the example of 
table 2, if p and p → q are designated as premises, then all three propositions, 
p, p → q and q, are collectively judged to be true. If p and q are designated as 
premises, then p is judged to be true and both q and p → q are judged to be 
false; finally, if q and p → q are designated as premises, then p → q is judged to 
be true, and both p and q are judged to be false. Although there seems to be a 
natural choice of premises in the present example, namely p and p → q, this 
may not generally be the case, and the outcome of a premise-based procedure 
may therefore depend as much on the choice of premises as it depends on the 
individual judgments to be aggregated. In the present example, an environ- 
mental activist may prefer to prioritize the propositions in such a way as to 
bring about the collective judgment that proposition q is true, while a trans- 
port lobbyist may prefer to prioritize them in such a way as to bring about the 
opposite judgment on q. 
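
The sensitivity to the choice of premises can be reproduced with a small sketch of a premise-based procedure applied to the judgments of table 2. The function name and the convention of returning None when the premises do not settle the remaining proposition are assumptions of this sketch, not part of the formal model.

```python
from itertools import product

PROPS = ("p", "p->q", "q")

# All consistent truth-value combinations for p, p->q and q.
VALUATIONS = [dict(zip(PROPS, (p, (not p) or q, q)))
              for p, q in product([True, False], repeat=2)]

# Individual judgments from table 2.
PROFILE = [
    {"p": True,  "p->q": True,  "q": True},
    {"p": True,  "p->q": False, "q": False},
    {"p": False, "p->q": True,  "q": False},
]


def premise_based(profile, premises):
    """Majority vote on each premise, then derive every other proposition
    if the collective premise judgments settle it (None otherwise)."""
    n = len(profile)
    collective = {prop: sum(j[prop] for j in profile) > n / 2 for prop in premises}
    compatible = [v for v in VALUATIONS
                  if all(v[prop] == collective[prop] for prop in premises)]
    for prop in PROPS:
        if prop not in premises:
            values = {v[prop] for v in compatible}
            collective[prop] = values.pop() if len(values) == 1 else None
    return collective


for premises in (("p", "p->q"), ("p", "q"), ("q", "p->q")):
    print(premises, premise_based(PROFILE, premises))
```

The three choices of premises give three different collective judgment sets for the same individual judgments, matching the outcomes described in the text.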

Under the distributed premise-based procedure, an additional sensitivity to 
the choice of ‘specialists’ on each premise arises. Likewise, in the case of the 
conclusion-based procedure, the choice of conclusions obviously matters, 
since the group makes collective judgments only on these conclusions and on 
no other propositions.⁸ 


4.6 Fourth solution: permitting incomplete collective judgments 


The first three solutions proposed in response to the impossibility theorem 
above have required relaxing one of the three minimal conditions on how 
individual judgments are aggregated into collective judgments. The present 
solution preserves these minimal conditions, but weakens the requirements on 
the collective judgments themselves by permitting incompleteness in these 
judgments (see also List 2006). 

If a group acting as an overall cognitive system is prepared to refrain from 
making a collective judgment on some propositions (namely on those on 
which there is too much disagreement between the group members), then it 
may use an aggregation procedure such as the ‘unanimity procedure’, whereby 
the group makes a judgment on a proposition if and only if the group mem- 
bers unanimously endorse that judgment. Propositions judged to be true by all 
members are collectively judged to be true; and ones judged to be false by all 
members are collectively judged to be false; no collective judgment is made 
on any other propositions. (Instead of the unanimity procedure, the group 


8 It can be shown that in some important respects, the premise-based procedure is 
more vulnerable to strategic manipulation than the conclusion-based procedure. See 
Dietrich and List (2005b). 
might also use ‘supermajority voting’ with a sufficiently large supermajority 
threshold.) 

Groups operating in a strongly consensual manner may well opt for this 
solution, but in many cases making no judgment on some propositions is 
simply not an option. For example, when an expert committee is asked to give 
advice on a particular issue, it is usually expected to take a determinate stance 
on that issue. 


4.7 Lessons to be drawn 


I have shown that a group’s capacity to form rational collective judgments 
depends on the group’s aggregation procedure: a group acting as a distributed 
cognitive system can ensure the rationality of its collective judgments on some 
non-trivially interconnected propositions only if it uses a procedure that 
violates one of universal domain, anonymity or systematicity or that produces 
incomplete collective judgments. Moreover, different aggregation procedures 
may lead to different collective judgments for the same combination of 
individual judgments. As an illustration, table 3 shows the collective judg- 
ments for the individual judgments in table 2 under different aggregation 
procedures. 
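
The remaining procedures discussed in this section can be sketched in the same style. The code below is an illustrative reading of the conclusion-based, distributed premise-based, unanimity and dictatorial procedures for the three propositions of the example; None stands for 'no judgment', individuals are indexed from 0, and the derivation rule for q is specific to this example.

```python
PROPS = ("p", "p->q", "q")

# Individual judgments from table 2 (index 0 is individual 1, and so on).
PROFILE = [
    {"p": True,  "p->q": True,  "q": True},
    {"p": True,  "p->q": False, "q": False},
    {"p": False, "p->q": True,  "q": False},
]


def majority(values):
    values = list(values)
    return sum(values) > len(values) / 2


def conclusion_based(profile, conclusion="q"):
    """Majority vote on the conclusion only; no judgment on anything else."""
    out = dict.fromkeys(PROPS)  # every proposition starts as None
    out[conclusion] = majority(j[conclusion] for j in profile)
    return out


def distributed_premise_based(profile, specialists):
    """specialists maps each premise ('p', 'p->q') to the indices of the
    individuals who vote on it; q is then derived from the two premises."""
    out = {prem: majority(profile[i][prem] for i in idx)
           for prem, idx in specialists.items()}
    # If p is accepted, q has the same truth value as p->q; if p is
    # rejected, the premises do not settle q (compare footnote 7).
    out["q"] = out["p->q"] if out["p"] else None
    return out


def unanimity(profile):
    """A judgment is made only where all individuals agree."""
    out = {}
    for prop in PROPS:
        values = {j[prop] for j in profile}
        out[prop] = values.pop() if len(values) == 1 else None
    return out


def dictatorship(profile, dictator):
    """The collective judgments are those of one fixed individual."""
    return dict(profile[dictator])


print(conclusion_based(PROFILE))
print(distributed_premise_based(PROFILE, {"p": [0], "p->q": [1]}))
print(unanimity(PROFILE))
print(dictatorship(PROFILE, 2))
```

Run on the judgments of table 2, these functions give the rows that table 3 reports for the corresponding procedures.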

If we were to assess a group’s performance as a distributed cognitive system 
solely on the basis of whether the group’s collective judgments are rational, 
this would give us insufficient grounds for selecting a unique aggregation 
procedure. As I have illustrated, many different aggregation procedures 
generate consistent collective judgments, and even if we require completeness 
in addition to consistency, several possible aggregation procedures remain. To 
recommend a suitable aggregation procedure that a group can use for a given 
cognitive task, the question of whether the group produces rational collective 
judgments is, by itself, not a sufficient criterion. 


Table 3 Different aggregation procedures applied to the individual judgments 
in table 2 

Procedure                                  p             p → q         q 

Majority voting*                           True          True          False 
Premise-based procedure with p,            True          True          True 
  p → q as premises 
Conclusion-based procedure with q          No judgment   No judgment   False 
  as conclusion 
Distributed premise-based procedure        True          False         False 
  with individual 1 specializing on p 
  and individual 2 specializing on p → q 
Unanimity procedure                        No judgment   No judgment   No judgment 
Dictatorship of individual 3               False         True          False 

* inconsistent 


5 Truth-tracking in a distributed cognitive system 


Can a group acting as a distributed cognitive system generate collective 
judgments that track the truth? Following Nozick (1981), a system ‘tracks the 
truth’ on some proposition p if two conditions are met. First, if — actually or 
counterfactually — p were true, the system would judge p to be true. Second, if 
— actually or counterfactually — p were not true, the system would not judge p 
to be true. These conditions can be applied to any cognitive system, whether it 
consists just of a single agent or of multiple agents acting together. In par- 
ticular, if a group’s organizational structure allows the group to form collec- 
tive judgments, then one can ask whether these judgments satisfy Nozick’s two 
conditions. 

As a simple measure of how well a system satisfies Nozick’s two conditions, 
I consider two conditional probabilities (List 2006): the probability that the 
system judges p to be true given that p is true, and the probability that the 
system does not judge p to be true given that p is false. Call these two con- 
ditional probabilities the system’s ‘positive’ and ‘negative reliability’ on p, 
respectively. 

By considering a group’s positive and negative reliability on various prop- 
ositions under different aggregation procedures and different scenarios, I now 
show that it is possible for a group acting as a distributed cognitive system to 
track the truth, but that, once again, the aggregation procedure affects the 
group’s success. 


5.1 The first scenario and its lesson: epistemic gains from democratization 


Suppose that a group is given the cognitive task of making a collective 
judgment on a single factual proposition, such as proposition p in the expert 
committee example above. As a baseline scenario (e.g., Grofman, Owen and 
Feld 1983), suppose that the group members hold individual judgments on 
proposition p, where two conditions are met. First, each group member has 
the same positive and negative reliability r on proposition p, where 1 > r > 1/2 
(the ‘competence’ condition); so individual judgments are noisy but biased 
towards the truth. Second, the judgments of different group members are 
mutually independent (the ‘independence’ condition). (Obviously, it is also 
important to study scenarios where these conditions are violated, and below I 
consider some such scenarios.⁹) 

A group acting as a distributed cognitive system must use an aggregation 
procedure to make its collective judgment on p based on the group members’ 
individual judgments on p. What is the group’s positive and negative reliability 
on p under different aggregation procedures? 

Let me compare three different procedures: first, a dictatorial procedure, 
where the collective judgment is always determined by the same fixed group 
member; second, the unanimity procedure, where agreement among all group 
members is necessary for reaching a collective judgment; and third, majority 
voting, which perhaps best implements the idea of a democratically organized 
form of distributed cognition (at least in the case of a single proposition). 

Under a dictatorial procedure, the group’s positive and negative reliability 
on p equals that of the dictator, which is r by assumption. 

Under the unanimity procedure, the group’s positive reliability on p equals 
rⁿ (where n is the group size), which approaches 0 as the group size increases, but its negative reliability 
on p equals 1 − (1 − r)ⁿ, which approaches 1 as the group size increases. This 
means that the unanimity procedure is good at avoiding false positive judg- 
ments, but bad at reaching true positive ones. A determinate collective 
judgment on p is reached only if all individuals agree on the truth-value of p; if 
they don’t agree, no collective judgment on p is made. 

Finally, under majority voting, the group’s positive and negative reliability 
on p approaches 1 as the group size increases. Why does this result hold? Each 
individual has a probability r > 0.5 of making a correct judgment on p; by the
law of large numbers, the proportion of individuals who make a correct
judgment on p approaches r > 0.5 as the group size increases and thus constitutes
a majority with a probability approaching 1. Informally, majority
voting allows the group to extract the signal from the group members’ judg- 
ments, while filtering out the noise. This is the famous ‘Condorcet jury the- 
orem’ (e.g., Grofman, Owen and Feld 1983). 
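
The three reliability formulas above can be checked numerically. The following is a minimal sketch, assuming independent voters with a common reliability r; the function names are illustrative, and counting a tie in an even-sized group as a failure is an assumption of the sketch, not a convention taken from the text.

```python
from math import comb


def majority_reliability(n: int, r: float) -> float:
    """Probability that more than half of n independent voters judge correctly.

    Each voter is correct with probability r. For even n, a tie is counted
    as a failure here, which is an assumption of this sketch.
    """
    smallest_majority = n // 2 + 1
    return sum(
        comb(n, j) * r**j * (1 - r) ** (n - j)
        for j in range(smallest_majority, n + 1)
    )


def unanimity_positive_reliability(n: int, r: float) -> float:
    """Probability that all n voters accept p when p is true: r^n."""
    return r**n


def unanimity_negative_reliability(n: int, r: float) -> float:
    """Probability that p is not collectively accepted when p is false."""
    return 1 - (1 - r) ** n


r = 0.54  # illustrative value used in the text
for n in (1, 11, 51, 101, 501):
    print(
        f"n={n:4d}  dictator={r:.3f}  "
        f"majority={majority_reliability(n, r):.3f}  "
        f"unanimity+={unanimity_positive_reliability(n, r):.3g}  "
        f"unanimity-={unanimity_negative_reliability(n, r):.3g}"
    )
```

Under these assumptions, the majority column rises with the group size, the dictator column stays at r, and the two unanimity columns move towards 0 and 1 respectively.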

Table 4 shows the group’s positive and negative reliability on p under
majority voting and under a dictatorial procedure, and tables 5 and 6 show,
respectively, the group’s positive and negative reliability on p under a dictatorial
procedure and under the unanimity procedure. In each case, individual
group members are assumed (as an illustration) to have a positive and negative
reliability of r = 0.54 on p. In all tables, the group size is on the horizontal
axis and the group’s reliability on the vertical axis.¹⁰

Table 4 The group’s positive and negative reliability on p: majority voting (top
curve); dictatorship (bottom curve) (setting r = 0.54 as an illustration)

Table 5 The group’s positive reliability on p: dictatorship (top curve);
unanimity procedure (bottom curve) (setting r = 0.54 as an illustration)

Table 6 The group’s negative reliability on p: unanimity procedure (top curve);
dictatorship (bottom curve) (setting r = 0.54 as an illustration)

9 Cases where different individuals have different levels of reliability are discussed, for
example, in Grofman, Owen and Feld (1983) and Boland (1989). Cases where there are
dependencies between different individuals’ judgments are discussed, for example, in
Ladha (1992), Estlund (1994) and Dietrich and List (2004). Cases where individuals
express their judgments strategically rather than truthfully are discussed in Austen-Smith
and Banks (1996).

What lessons can be drawn from this first scenario? If individuals are 
independent, fallible, but biased towards the truth, majority voting outper- 
forms both dictatorial and unanimity procedures in terms of maximizing the 
group’s positive and negative reliability on p. The unanimity procedure is 
attractive only in those special cases where the group seeks to minimize the 
risk of making false positive judgments, such as in some jury decisions. A 
dictatorial procedure fails to pool the information held by different individ- 
uals. 

10 The present curves are the result of averaging between two separate curves for
even- and odd-numbered group sizes. When the group size is an even number, the
group’s reliability may be lower because of the possibility of majority ties.

Hence, when a group acting as a distributed cognitive system seeks to track
the truth, there may be ‘epistemic gains from democratization’, i.e. from
making a collective judgment on a given proposition democratically by using
majority voting. More generally, even when individual reliability differs
between individuals, a weighted form of majority voting still outperforms a
dictatorship by the most reliable individual: each individual’s vote simply
needs to have a weight proportional to log(r/(1 − r)), where r is the individual’s
reliability on the proposition in question (Ben-Yashar and Nitzan 1997).
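
The log-odds weighting rule can be illustrated by exact enumeration over all vote profiles of a small group. This is only a sketch: the five reliability values are hypothetical, the function names are illustrative, and crediting a tie with probability 1/2 (a coin flip) is an assumption of the sketch.

```python
from itertools import product
from math import log


def profile_probability(correct, reliabilities):
    """Probability of one pattern of correct (True) and incorrect (False) votes."""
    prob = 1.0
    for is_correct, r in zip(correct, reliabilities):
        prob *= r if is_correct else 1 - r
    return prob


def weighted_majority_reliability(reliabilities, weights):
    """Exact probability that the weighted vote favours the truth.

    A tie in the weighted vote is credited with probability 1/2
    (assumed coin flip).
    """
    total = 0.0
    for correct in product([True, False], repeat=len(reliabilities)):
        margin = sum(w if c else -w for c, w in zip(correct, weights))
        prob = profile_probability(correct, reliabilities)
        if margin > 1e-12:
            total += prob
        elif abs(margin) <= 1e-12:
            total += 0.5 * prob
    return total


# Hypothetical reliabilities, chosen only for illustration.
reliabilities = [0.75, 0.70, 0.70, 0.65, 0.65]
log_odds_weights = [log(r / (1 - r)) for r in reliabilities]
equal_weights = [1.0] * len(reliabilities)

weighted = weighted_majority_reliability(reliabilities, log_odds_weights)
simple = weighted_majority_reliability(reliabilities, equal_weights)
dictator = max(reliabilities)
print(f"log-odds weights: {weighted:.4f}")
print(f"equal weights:    {simple:.4f}")
print(f"best individual:  {dictator:.4f}")
```

With these hypothetical values, the log-odds rule does at least as well as simple majority voting and better than the most reliable individual alone.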


5.2 The second scenario and its lesson: epistemic gains from disaggregation 


Suppose now that a group is given the cognitive task of making a collective 
judgment not only on a single factual proposition, but on a set of interconnected
factual propositions. As an illustration, suppose that there are k > 1
premises p_1, ..., p_k and a conclusion q, where q is true if and only if the conjunction
of p_1, ..., p_k is true. This structure also allows representing a variant of
the expert committee example above. For extensive discussions of the present
scenario and other related scenarios, see Bovens and Rabinowicz (2005) and
List (2005a, 2006). Analogous points apply to the case where q is true if and
only if the disjunction of p_1, ..., p_k is true.

In this case of multiple interconnected propositions, individuals cannot 
generally have the same reliability on all propositions. Suppose, as an illustration,
that each individual has the same positive and negative reliability r on
each premise p_1, ..., p_k and makes independent judgments on different
premises. Then each individual’s positive reliability on the conclusion q is r^k,
which is below r and often below 0.5 (whenever r < k-th root of 0.5), while his
or her negative reliability on q is above r. Here individuals are much worse at
detecting the truth of the conclusion than the truth of each premise, but much 
better at detecting the falsehood of the conclusion than the falsehood of each 
premise. In the expert committee example, it might be easier to make correct 
judgments on propositions p and p → q than on proposition q. Of course, other
scenarios can also be constructed, but the point remains that individuals 
typically have different levels of reliability on different propositions (List 
2006). 

What is the group’s positive and negative reliability on the various propo- 
sitions under different aggregation procedures? As before, suppose the 
judgments of different group members are mutually independent. 

Majority voting performs well only on those propositions on which 
individuals have a positive and negative reliability above 0.5. As just argued, 
individuals may not meet this condition on all propositions. Moreover, 
majority voting does not generally produce consistent collective judgments 
(on the probability of majority inconsistencies, see List 2005a). Let me now 
compare dictatorial, conclusion-based and premise-based procedures. 

Under a dictatorial procedure, the group’s positive and negative reliability 
on each proposition equals that of the dictator; in particular, the probability
that all propositions are judged correctly is r^k, which may be very low, especially
when the number of premises k is large.

Table 7 The group’s probability of judging all propositions correctly: premise-based
procedure (top curve); dictatorship (bottom curve) (setting r = 0.54 as an
illustration)


Under the conclusion-based procedure, unless individuals have a high 
reliability on each premise, namely r > k-th root of 0.5 (e.g. 0.71 when k=2, or 
0.79 when k=3), the group’s positive reliability on the conclusion q 
approaches 0 as the group size increases. Its negative reliability on q 
approaches 1. Like the unanimity procedure in the single-proposition case, the 
conclusion-based procedure is good at avoiding false positive judgments on 
the conclusion, but (typically) bad at reaching true positive ones (see also 
Bovens and Rabinowicz 2005). 
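
The threshold values quoted above, and the behaviour of the conclusion-based procedure below the threshold, can be reproduced with a short calculation. This is a sketch assuming odd group sizes, independent voters, and that each member accepts q exactly when he or she judges every premise to be true; the function names are illustrative.

```python
from math import comb


def majority_reliability(n: int, r: float) -> float:
    """Probability that more than half of n independent voters are correct
    (n is assumed to be odd, so that ties cannot occur)."""
    return sum(
        comb(n, j) * r**j * (1 - r) ** (n - j)
        for j in range(n // 2 + 1, n + 1)
    )


def premise_threshold(k: int) -> float:
    """k-th root of 0.5: the premise reliability at which r^k equals 1/2."""
    return 0.5 ** (1 / k)


def conclusion_based_positive_reliability(n: int, r: float, k: int) -> float:
    """Majority vote on q when each member is right about a true q with
    probability r^k."""
    return majority_reliability(n, r**k)


for k in (2, 3):
    print(f"k={k}: threshold = {premise_threshold(k):.2f}")

for n in (1, 11, 101):
    value = conclusion_based_positive_reliability(n, 0.54, 2)
    print(f"n={n:3d}: positive reliability on q = {value:.4f}")
```

For r = 0.54 and k = 2 the individual positive reliability on q is 0.54² = 0.2916, which is below 0.5, so the group's positive reliability on q falls as the group grows.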

Under the premise-based procedure, the group’s positive and negative 
reliability on every proposition approaches 1 as the group size increases. This 
result holds because, by the Condorcet jury theorem as stated above, the 
group’s positive and negative reliability on each premise p_1, ..., p_k approaches
1 with increasing group size, and therefore the probability that the group 
derives a correct judgment on the conclusion also approaches 1 with 
increasing group size. 
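
As a rough numerical companion to this argument, the probability of judging all propositions correctly can be compared under the premise-based procedure and under a dictatorship. The sketch assumes odd group sizes and independence both across voters and across premises, so that the premise-wise majority outcomes multiply; the function names are illustrative.

```python
from math import comb


def majority_reliability(n: int, r: float) -> float:
    """Probability that more than half of n independent voters are correct
    (n is assumed to be odd, so that ties cannot occur)."""
    return sum(
        comb(n, j) * r**j * (1 - r) ** (n - j)
        for j in range(n // 2 + 1, n + 1)
    )


def premise_based_all_correct(n: int, r: float, k: int) -> float:
    """All k premise-wise majority votes are correct; the conclusion is then
    derived correctly from the premises."""
    return majority_reliability(n, r) ** k


def dictatorship_all_correct(r: float, k: int) -> float:
    """The dictator judges all k premises correctly: r^k."""
    return r**k


r, k = 0.54, 2  # illustrative values used in the text
print(f"dictatorship: {dictatorship_all_correct(r, k):.4f}")
for n in (11, 101, 501):
    print(f"premise-based, n={n:3d}: {premise_based_all_correct(n, r, k):.4f}")
```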

As an illustration, suppose that there are k = 2 premises and individuals have a
positive and negative reliability of r = 0.54 on each premise. Table 7 shows the
group’s probability of judging all propositions correctly under the premise-based
procedure and under a dictatorial procedure. Tables 8 and 9 show,
respectively, the group’s positive and negative reliability on the conclusion q
under a dictatorial procedure and under the conclusion-based procedure.

Table 8 The group’s positive reliability on the conclusion q: dictatorship (top
curve); conclusion-based procedure (bottom curve) (setting r = 0.54 as an
illustration)

Table 9 The group’s negative reliability on the conclusion q: conclusion-based
procedure (top curve); dictatorship (bottom curve) (setting r = 0.54 as an
illustration)

What lessons can be drawn from this second scenario? Under the present
assumptions, the premise-based procedure outperforms both dictatorial and 
conclusion-based procedures in terms of simultaneously maximizing the 
group’s positive and negative reliability on every proposition. Like the una- 
nimity procedure before, the conclusion-based procedure is attractive only 
when the group seeks to minimize the risk of making false positive judgments 
on the conclusion; again, a dictatorial procedure is bad at information pooling. 

Hence, if a larger cognitive task such as making a judgment on some con- 
clusion can be disaggregated into several smaller cognitive tasks such as 
making judgments on relevant premises, then there may be ‘epistemic gains 
from disaggregation’, i.e. from making collective judgments on that conclusion 
on the basis of separate collective judgments on those premises. (For further 
results and a discussion of different scenarios, see Bovens and Rabinowicz 
2005 and List 2006.) 


5.3 The third scenario and its lesson: epistemic gains from distribution 


When a group is faced with a complex cognitive task that requires making 
judgments on several propositions, different members of the group may have 
different levels of expertise on different propositions. This is an important 
characteristic of many committees, groups of scientific collaborators, large 
organizations, and so on. Moreover, each individual may lack the temporal, 
computational and informational resources to become sufficiently reliable on 
every proposition. If we take this problem into account, can we improve on 
the premise-based procedure? 

Suppose, as before, that a group has to make collective judgments on k > 1 
premises p_1, ..., p_k and a conclusion q, where q is true if and only if the conjunction
of p_1, ..., p_k is true. Instead of requiring every group member to make
a judgment on every premise, we might partition the group into k subgroups 
(for simplicity, of approximately equal size), where the members of each 
subgroup specialize on one premise and make a judgment on that premise 
alone. Instead of using a regular premise-based procedure as in the previous
scenario, the group might now use a distributed premise-based procedure: the 
collective judgment on each premise is made by taking a majority vote within 
the subgroup specializing on that premise, and the collective judgment on the 
conclusion is then derived from these collective judgments on the premises. 

When does the distributed premise-based procedure outperform the regular 
premise-based procedure at maximizing the group’s probability of making 
correct judgments on the propositions? 

Intuitively, there are two effects here that pull in opposite directions. First, 
there may be ‘epistemic gains from specialization’: individuals may become 
more reliable on the proposition on which they specialize. But, second, there 
may also be ‘epistemic losses from lower numbers’: each subgroup voting on a
particular proposition is smaller than the original group (it is only approximately
1/k the size of the original group when there are k premises), which
may reduce the benefits from majoritarian judgment aggregation on that 
proposition. 

Whether or not the distributed premise-based procedure outperforms the 
regular premise-based procedure depends on which of these two opposite 
effects is stronger. Obviously, if there were no epistemic gains from special- 
ization, then the distributed premise-based procedure would suffer only from 
losses from lower numbers on each premise and would therefore perform 
worse than the regular premise-based procedure. On the other hand, if the 
epistemic losses from lower numbers were relatively small compared to the 
epistemic gains from specialization, then the distributed premise-based pro- 
cedure would outperform the regular one. The following result holds: 

Theorem. For any group size n (divisible by k), there exists an individual 
(positive and negative) reliability level r* > r such that the following holds: if,
by specializing on some proposition p, individuals achieve a reliability above 
r* on p, then the majority judgment on p in a subgroup of n/k specialists (each 
with reliability above r* on p) is more reliable than the majority judgment on p in the
original group of n non-specialists (each with reliability r on p). 

Hence, if by specializing on one premise, individuals achieve a reliability 
above r* on that premise, then the distributed premise-based procedure 
outperforms the regular premise-based procedure. How great must the reli- 
ability increase from r to r* be to have this effect? Strikingly, a small reliability 
increase typically suffices. Table 10 shows some sample calculations. For
example, when there are k = 2 premises, if the original individual reliability
was r = 0.52, then a reliability above r* = 0.5281 after specialization suffices; if it
was r = 0.6, then a reliability above r* = 0.6393 after specialization suffices.


Table 10 Reliability increase from r to r* required to outweigh the loss from 
lower numbers 


        k = 2, n = 50               k = 3, n = 51               k = 4, n = 52

r  =    0.52    0.6     0.75        0.52    0.6     0.75        0.52    0.6     0.75
r* =    0.5281  0.6393  0.8315      0.5343  0.6682  0.8776      0.5394  0.6915  0.9098
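
A threshold of this kind can be approximated by bisection on the specialists' reliability. The sketch below is not the calculation behind Table 10: it assumes independent voters and credits a tie in an even-sized group with probability 1/2, so its output need not match the table to the last digit. The function names are illustrative.

```python
from math import comb


def majority_reliability(n: int, r: float) -> float:
    """Probability that a majority of n independent voters is correct.

    For even n a tie is credited with probability 1/2 (assumed coin flip).
    """
    total = sum(
        comb(n, j) * r**j * (1 - r) ** (n - j)
        for j in range(n // 2 + 1, n + 1)
    )
    if n % 2 == 0:
        total += 0.5 * comb(n, n // 2) * (r * (1 - r)) ** (n // 2)
    return total


def required_specialist_reliability(n: int, k: int, r: float, tol: float = 1e-6) -> float:
    """Approximate r*: the reliability at which n // k specialists match
    the majority reliability of n non-specialists with reliability r."""
    target = majority_reliability(n, r)
    subgroup = n // k
    lo, hi = r, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if majority_reliability(subgroup, mid) > target:
            hi = mid
        else:
            lo = mid
    return hi


for n, k in ((50, 2), (51, 3), (52, 4)):
    for r in (0.52, 0.6, 0.75):
        r_star = required_specialist_reliability(n, k, r)
        print(f"k={k}, n={n}, r={r}: r* is roughly {r_star:.4f}")
```

Bisection is adequate here because, for a fixed group size, the majority reliability increases monotonically in the individual reliability.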


Table 11 shows the group’s probability of judging all propositions correctly
under regular and distributed premise-based procedures, where there are k = 2
premises and where individuals have positive and negative reliabilities of r = 0.54
and r* = 0.58 before and after specialization, respectively.

Table 11 The group’s probability of judging all propositions correctly: distributed
(top curve) and regular premise-based procedure (bottom curve)
(setting r = 0.54 and r* = 0.58 as an illustration)

What lessons can be drawn from this third scenario? Even when there are
only relatively modest gains from specialization, the distributed premise- 
based procedure may outperform the regular premise-based procedure in 
terms of maximizing the group’s positive and negative reliability on every 
proposition. 

Hence there may be ‘epistemic gains from distribution’: if a group has to 
perform a complex cognitive task, the group may benefit from subdividing the 
task into several smaller tasks and distributing these smaller tasks across 
multiple subgroups. Plausibly, such division of cognitive labour is the mech- 
anism underlying the successes of collectively distributed cognition in science, 
as investigated by Knorr Cetina (1999), Giere (2002) and others. The research 
practices in large-scale collaborative research projects, such as those in high- 
energy physics or in other large expert teams as mentioned above, rely on 
mechanisms similar to those represented, in a stylized form, by the distributed 
premise-based procedure. 

In conclusion, a group acting as a distributed cognitive system can succeed 
at tracking the truth, but the group’s aggregation procedure plays an impor- 
tant role in determining the group’s success. 

6 Concluding remarks


I have discussed collectively distributed cognition from a social-choice-theo- 
retic perspective. In particular, I have introduced the emerging theory of 
judgment aggregation to propose a way of modelling a group as a distributed 
cognitive system, i.e. as a system that can generate collective judgments. 
Within this framework, I have asked whether such a group can be rational and 
track the truth in its collective judgments. My main finding is that a group’s 
performance as a distributed cognitive system depends crucially on its 
aggregation procedure, and I have investigated how the aggregation proce- 
dure matters. 

With regard to a group’s rationality as a distributed cognitive system, I have 
discussed an impossibility theorem by which we can characterize the logical 
space of aggregation procedures that a group can use to generate rational 
collective judgments. No aggregation procedure generating consistent and 
complete collective judgments can simultaneously satisfy universal domain, 
anonymity and systematicity. To find an aggregation procedure that produces 
rational collective judgments, it is therefore necessary to relax one of universal 
domain, anonymity or systematicity, or to weaken the requirement of ration- 
ality itself by permitting incomplete collective judgments. Which relaxation is 
most defensible depends on the group and cognitive task in question. 

With regard to a group’s capacity to track the truth as a distributed cog- 
nitive system, I have identified three effects that are relevant to the design of a 
good aggregation procedure: there may be epistemic gains from democra- 
tization, disaggregation and distribution. Again, the applicability and magni- 
tude of each effect depends on the group and cognitive task in question, and 
there may not be a ‘one size fits all’ aggregation procedure that is best for all 
groups and all cognitive tasks. But the fact that a group may sometimes 
benefit from the identified effects reinforces the potential of epistemic gains 
through collectively distributed cognition. 

The present results give a fairly optimistic picture of a group’s capacity to 
perform as a distributed cognitive system. I have thereby focused on coop- 
erative rather than competitive practices within groups or communities. It is 
an important empirical question how pervasive such cooperative practices are 
and how often the favourable conditions such practices require are met. 
Clearly, scientific communities are characterized by both competitive and 
cooperative practices. Much research in the sociology and economics of sci- 
ence has focused on competitive practices (as evidenced by the theme of the 
2005 Conference on New Political Economy). There has also been much 
research on rationality failures and inefficiencies that can arise in groups 
trying to perform certain tasks at a collective level. Public choice theorists, in 
particular, have highlighted the impossibility results on democratic aggregation
and the pervasiveness of suboptimal equilibria in various collective
interactions. 

Clearly, the details of my rather more optimistic results depend on various 
assumptions and may change with changes in these assumptions. But my aim 
has not been to argue that all groups acting as distributed cognitive systems 
perform well; indeed, this claim is likely to be false. Rather, my aim has been 
to show that successful distributed cognition in a group is possible and to 
illustrate the usefulness of the theory of judgment aggregation for inves- 
tigating how it is possible and under what conditions. 

Comment by


SIEGFRIED K. BERNINGHAUS 


1 Brief summary of List’s contribution 


From a general point of view, the contribution by Christian List is concerned
with social processes that can be called cognitive and distributed.
Distributed cognition is not a new field, nor is it applied to problems in
social science for the first time in this paper. More often, however, ideas of
distributed cognition are found in other fields of research, for example
artificial intelligence or distributed computing.

To learn some basic facts about distributed cognition, we need not even go 
deeply into the artificial intelligence literature. In a nutshell, we can observe 
most aspects and problems of distributed cognition in our university system 
itself. In some sense the numerous committees established in universities, for
example committees on allocating the university budget or on
establishing new courses of study, can be regarded as expert panels representing
distributed intelligence. There is thus a strong connection between List’s theoretical
paper and the practical problems of scientific competition, which is the main topic of
this conference.

In his paper, Christian List discusses distributed cognition from the par- 
ticular perspective of Social Choice Theory. He basically refers to results on 
group aggregation procedures in judgement aggregation. Such procedures are 
used to combine individual beliefs and judgements of the members of a group 
into collective beliefs and judgements. We know from the pioneering work by
Christian List himself and other authors (see, for example, List and Pettit
(2002), List and Pettit (2004), Dietrich and List (2004)) that judgement
aggregation procedures may suffer from serious inconsistencies in the collective
decisions, i.e., that a group may not achieve consistent judgements
although all group members individually hold consistent beliefs. A famous
impossibility result shows that consistent collective judgements are not possible
if the aggregation procedure satisfies some mild conditions.¹
These conditions are universal domain (any logically possible combination of
personal sets of judgements is admissible as aggregation input), anonymity
(the collective set of judgements is invariant under any permutation of the
individuals) and systematicity (the collective judgement depends exclusively on
the pattern of individual judgements). In the literature on judgement aggregation
one can find various strategies to escape from this impossibility result.
One strategy could be to give up some of the basic requirements on the
aggregation rule or to consider incomplete judgements.

1 There exist some analogies to Arrow’s famous impossibility theorem on the aggregation
of individual preferences (for details see List and Pettit 2004).

In his paper, List addresses the question of whether collective beliefs or
judgements constitute group knowledge. He considers the positive and negative
reliability on propositions under various aggregation procedures
and scenarios. Suppose that each group member has a certain degree of
reliability on a factual proposition p.

1. In scenario 1, the effect of three different aggregation rules on the group’s
reliability on the same factual proposition is investigated.

2. In scenario 2, the focus is on the collective judgement on a collection of 
interconnected factual propositions. 

3. Scenario 3 is concerned with gains from specialization which may arise 
from splitting up the original group into expert panels who are responsible 
for making judgements only on a small subset of interconnected proposi- 
tions. 


2 Questions and comments 


List’s approach to judgement aggregation is an interesting and innovative 
contribution to Social Choice. Compared with most papers presented at this 
conference, this contribution is rather abstract. However, this may not be a 
disadvantage at all. Quite to the contrary, this paper contributes substantially 
to basic research in group rationality and, therefore, to basic research in the 
theory of scientific competition, too. 


2.1 General comments 


1. In the paper, institutional design is more or less identified with the judgement
aggregation rule. I think this is a rather narrow interpretation of the
processes taking place in institutions. Of course, aggregation rules are the core
part of institutional design, but important strategic aspects are missing. Why
don’t group members try to manipulate the aggregation procedure
itself, and why don’t they communicate with other group members to reach
negotiated arrangements?

2. In the logical framework of List and Pettit, the individuals do not express
preferences (as in Arrow’s Social Choice framework) but make statements
about their beliefs in the truth of propositions. Can we interpret the individual
reliability probabilities p as degrees of confirmation (in the sense of Carnap’s
Inductive Logic²)? Then the p’s could differ even in a group of homogeneous
individuals because of different individually accumulated empirical evidence
(for the truth of a proposition). Therefore, the judgement aggregation problem
could be transformed into a belief aggregation problem.

Belief aggregation problems were considered, for example, by DeGroot
(1974), who proved in an elegant approach (via Markovian process arguments)
that the weights which each group member attaches to the subjective
beliefs of the remaining group members converge to a common weight
scheme, which may generate common subjective beliefs in the group.
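
The convergence described here can be illustrated with a small iteration in which each member repeatedly replaces his or her belief by a weighted average of all members' current beliefs. This is a sketch in the spirit of DeGroot's model, not a reproduction of his proof; the weight matrix and the initial beliefs are hypothetical, and the function names are illustrative.

```python
def degroot_step(weights, beliefs):
    """One round: each member's new belief is a weighted average of all beliefs."""
    return [sum(w * b for w, b in zip(row, beliefs)) for row in weights]


def degroot_consensus(weights, beliefs, tol=1e-12, max_rounds=10_000):
    """Iterate until all beliefs agree up to tol (or max_rounds is reached)."""
    for _ in range(max_rounds):
        if max(beliefs) - min(beliefs) < tol:
            break
        beliefs = degroot_step(weights, beliefs)
    return beliefs


# Hypothetical weights: row i holds the weights member i attaches to
# members 0, 1 and 2; each row sums to one and all entries are positive.
weights = [
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]
initial_beliefs = [0.9, 0.5, 0.2]  # hypothetical subjective probabilities

final_beliefs = degroot_consensus(weights, initial_beliefs)
print([round(b, 6) for b in final_beliefs])
```

With strictly positive weights the spread between the highest and the lowest belief shrinks in every round, so the beliefs settle on a common value between the initial extremes.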


2.2 Specific comments 


1. First comment on scenario 1: The most important conclusion of this scenario
is that the epistemic gains from democratization (majority voting) are the
highest when compared with two alternative aggregation procedures
(dictatorial and unanimity).

Is a set of (two) alternatives really large enough to draw definitive con- 
clusions on the superiority of majority voting? Don’t there exist many more 
voting procedures? 

2. Second comment on scenario 1: Results in this scenario are derived from 
rather restrictive formal assumptions on the majority voting aggregation 
procedure. More concretely, group members must have identical reliability in 
judging the truth of a proposition and, moreover, they should act independ- 
ently from each other. 

The results on majority voting are based on the simple relation

\[ r_{\mathrm{group}} = \mathrm{Prob}\Big( \sum_{i=1}^{n} X_i \ge m + 1 \Big) \]

where X_i ∈ {0,1} denotes the random variable that group member i judges
the proposition in question as being true and n = 2m + 1 is the group size,
which is supposed to be odd. The group’s reliability r_group on a proposition can
then be calculated explicitly when the X_i are stochastically independent and
its limit can be determined when the group size increases.
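
As a worked version of this explicit calculation, and assuming (as in List's baseline scenario) that the X_i are independent with a common reliability r = Prob(X_i = 1) when the proposition is true, the relation above becomes a binomial tail sum:

```latex
\[
r_{\mathrm{group}}
  = \mathrm{Prob}\Big(\sum_{i=1}^{n} X_i \ge m+1\Big)
  = \sum_{j=m+1}^{2m+1} \binom{2m+1}{j}\, r^{j} (1-r)^{2m+1-j},
\qquad
\lim_{m\to\infty} r_{\mathrm{group}} = 1 \quad \text{if } r > \tfrac12 .
\]
```

The limit statement is the Condorcet jury theorem in this notation; the symbol j, the number of members who judge the proposition to be true, is introduced here only for the summation.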

List himself mentions that these restrictive assumptions on judgement
aggregation can be relaxed without changing his results. Boland (1989), for
example, shows that the aggregation results on group reliability still hold when
the group members’ independence assumption is replaced by a less
restrictive assumption postulating the existence of an opinion leader in the
group. There is no unique way to escape from the independence assumption.
Many alternative assumptions modelling dependence between group
members in judgement aggregation exist. Some may be more and some
less reasonably applied to this particular Social Choice framework. In the
following, I would like to suggest two interesting extensions of the independence
assumption.

2 That is, we define the reliability p of a group member in judging a proposition to be true
as the degree of confirmation c(h|e) of proposition h supported by accumulated
empirical evidence e (see Carnap and Jeffrey 1971).

a) Suppose the {X_i}_i are exchangeable random variables, i.e.

\[ \mathrm{Prob}(\{X_1 = x_1, \ldots, X_n = x_n\}) = \mathrm{Prob}(\{X_{\pi(1)} = x_1, \ldots, X_{\pi(n)} = x_n\}) \]

for any permutation π: {1,...,n} → {1,...,n} and x_i ∈ {0,1}.

Exchangeability has an interesting interpretation: According to De Finetti’s
theorem (combined with Aldous’s results 1985) on finite exchangeable random
variables, the reliability of the judgements of the group members

\[ \mathrm{Prob}(\{X_1 = x_1, \ldots, X_n = x_n\}) \]

can be (approximately) regarded as a “mixture” of the judgements of independently
judging group members, where the weights are determined by
“collective events” which concern the whole group. In other words, we still
assume some type of independence in judgement making on the individual
level which, however, has to be conditioned on collective events.³

I am not sure how the results of majority voting in the modified model with 
exchangeable agents will change. Because of the equivalence of exchange- 
ability and conditional independence, I would conjecture that List’s results 
will remain valid conditionally on the occurrence of specific collective events. 
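
One way to explore this conjecture numerically is to model the collective events as a finite mixture: a collective event first fixes a common reliability, and members then judge independently given that event. The sketch below assumes this mixture representation holds exactly (for finite groups it holds only approximately) and uses a hypothetical two-event mixture; the function names are illustrative.

```python
from math import comb


def majority_reliability(n: int, r: float) -> float:
    """Probability that more than half of n independent voters are correct
    (n is assumed to be odd, so that ties cannot occur)."""
    return sum(
        comb(n, j) * r**j * (1 - r) ** (n - j)
        for j in range(n // 2 + 1, n + 1)
    )


def exchangeable_majority_reliability(n: int, mixture) -> float:
    """Majority reliability when judgements are conditionally independent.

    mixture is a list of (probability of the collective event, reliability
    of every member given that event) pairs.
    """
    return sum(weight * majority_reliability(n, r) for weight, r in mixture)


# Hypothetical collective events: with probability 0.8 members are mildly
# reliable (r = 0.6); with probability 0.2 they are misled (r = 0.45).
mixture = [(0.8, 0.60), (0.2, 0.45)]
for n in (1, 11, 51, 201):
    print(f"n={n:3d}: {exchangeable_majority_reliability(n, mixture):.4f}")
```

In this hypothetical example the group's reliability moves towards 0.8, the probability of the favourable collective event, rather than towards 1. This is consistent with the jury-theorem result holding only conditionally on collective events under which r exceeds 1/2.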

b) It is reasonable to assume that all human groups are composed of 
individuals having many social interactions with their neighbors. In other 
words, each group can be characterized by a social network structure which has 
an important impact on individual judgements. As an illustrative example of a 
simple network structure see Figure 1, where each group member is connected 
to 4 other group members (one neighbor on the right, one on the left, one 
above, and one below). 

Being connected to some group members can be interpreted in this 
framework as being influenced by the judgement of the neighbors. 

Formally, a group’s judgement configuration in period t can be defined as a
mapping

\[ \eta_t : S \to \{0,1\}, \]

where S denotes the sites of a graph which represents the local interaction
structure imposed on the whole group.⁴ There exist many models in the literature
(for example, “voter models”, “contagion models”, see Liggett 1985,
Morris 2000) dealing with the evolution of decision making in groups with
social network structures. In a voter model, for example, it is assumed that the
rate at which each group member x ∈ S flips from judging a proposition being
true to being false is given by

\[ \frac{1}{4} \sum_{y:\,|y-x|=1} 1_{\{\eta_t(y) \neq \eta_t(x)\}}, \]

i.e. by the fraction of the four neighbors y of x whose current judgement differs
from that of x. In the voter model, one is
interested in the temporal evolution of the probabilities of judgement configurations
η_t when t increases. In our simple 2-dimensional social network the
process {η_t}_t has two limit distributions. Either we have η*(·) = 1 with probability
equal to one or we have η*(·) = 0 with probability equal to one. In
higher-dimensional social network structures some non-trivial results (with
η*(·) ≠ 0 and η*(·) ≠ 1) hold.⁵

Figure 1 2-dimensional local interaction

3 In technical terms, collective events are elements of the terminal σ-algebra generated
by the random variables {X_i}_i.
4 In the example presented in Figure 1, sites of the interaction graph are the points 1-9.

In the voter model, emphasis is laid on the temporal evolution of judge- 
ments in a group of infinitely many members. Another view on the impact of 
social interaction structures would be to start from a finite group with a given 
social interaction structure and let the number of participants go to infinity. 
However, we cannot go into the details here. 
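
For a finite group, the dynamics can at least be simulated. The following sketch uses a discrete-time analogue of the voter model on a 3 × 3 grid with wrap-around edges, so that each of the nine members has four neighbors; the wrap-around layout, the update rule (a randomly chosen member copies a randomly chosen neighbor) and all names are assumptions of the sketch, not details taken from the cited models.

```python
import random


def neighbours(site: int, side: int):
    """Four neighbours (above, below, left, right) on a side x side torus."""
    row, col = divmod(site, side)
    return [
        ((row - 1) % side) * side + col,
        ((row + 1) % side) * side + col,
        row * side + (col - 1) % side,
        row * side + (col + 1) % side,
    ]


def simulate_voter_model(side: int = 3, seed: int = 0, max_steps: int = 100_000):
    """Run the discrete-time voter dynamics until consensus or max_steps.

    Returns the final judgement configuration and the number of updates made.
    """
    rng = random.Random(seed)
    n_sites = side * side
    config = [rng.randint(0, 1) for _ in range(n_sites)]
    for step in range(max_steps):
        if len(set(config)) == 1:  # every member holds the same judgement
            return config, step
        site = rng.randrange(n_sites)
        config[site] = config[rng.choice(neighbours(site, side))]
    return config, max_steps


final_config, steps = simulate_voter_model(side=3, seed=1)
print(f"final configuration {final_config} after {steps} updates")
```

In a finite group such dynamics end in one of the two unanimous configurations, which is why the non-trivial limit results mentioned above require an infinite group.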


Summarizing, I believe that there are many ways to relax the
independence assumption in List’s aggregation procedure. It would be interesting
to see how these alternative assumptions would change the results in
List’s paper.

5 Note that in order to derive these results, one has to assume that the group size is
infinite.